Abstract: We consider a number of constrained coding techniques
that can be used to mitigate a nonlinear effect in the
optical fiber channel that causes the formation of spurious pulses,
called "ghost pulses." Specifically, if $b_1 b_2 \ldots b_n$ is a sequence of
bits sent across an optical channel, such that $b_k = b_l = b_m = 1$
for some $k, l, m$ (not necessarily all distinct) but $b_{k+l-m} = 0$,
then the ghost-pulse effect causes $b_{k+l-m}$ to change to 1, thereby
creating an error. Such errors do not occur if the sequence
of bits satisfies the following constraint: for all integers $k, l, m$
such that $b_k = b_l = b_m = 1$, we have $b_{k+l-m} = 1$. We call
this the binary ghost-pulse (BGP) constraint. We will show,
however, that the BGP constraint has zero capacity, implying
that sequences satisfying this constraint cannot carry much
information. Consequently, we consider a more sophisticated
coding scheme, which uses ternary sequences satisfying a certain
ternary ghost-pulse (TGP) constraint. We further relax these
constraints by ignoring interactions between symbols that are
more than a certain distance $t$ apart in the transmitted sequence.
Analysis of the resulting BGP($t$) and TGP($t$) constraints shows
that these have nonzero capacities, and furthermore, the
TGP($t$)-constrained codes can achieve rates that are significantly
higher than those for the corresponding BGP($t$) codes. We also
discuss the design of encoders and decoders for coding into the
BGP, BGP($t$) and TGP($t$) constraints.

Index Terms — Binary ghost-pulse (BGP) constraint, capacity of 
constrained systems, constrained encoding and decoding, optical 
communication, ternary ghost-pulse (TGP) constraint. 

I. Introduction 
High data-rate optical fiber communication presents several 
interesting challenges to a coding theorist. The diverse 
impairments peculiar to the optical channel necessitate the 
development of new coding schemes, capable of mitigating 
the effects of these impairments. One such impairment 
is the nonlinear effect known as intrachannel four-wave 
mixing (FWM) — see [9], [20], [23] and references therein. 
FWM results in strong inter-symbol interference between the 
symbols in a bitstream transmitted across the optical fiber. It 
is widely accepted [19], [20], [26] that at bit rates of 40Gbps 
and beyond, FWM will play a major role in limiting the 
information-carrying capacity and the propagation distance of
a dispersion-managed optical communication system. In this
paper, we consider a number of constrained coding techniques 
motivated by the intrachannel FWM effect. 

A. Background on Ghost-Pulse Formation 

In a typical optical fiber communication scenario, a train of 
light pulses, corresponding to a sequence of n bits, is sent 
across an optical fiber. Each bit in the sequence is allocated 
a time slot of duration T, and a binary one or zero is marked 
by the presence or absence of a pulse in that time-slot. The 
effect of intrachannel FWM is to transfer energy from triples 
of pulses in '1'-slots into certain '0'-slots, thereby creating
spurious pulses known as ghost pulses. It has been observed
that the interaction of pulses in the $k$-th, $l$-th, and $m$-th
time-slots pumps energy into the $(k+l-m)$-th time-slot. If this
slot did not originally contain a pulse (that is, if the
$(k+l-m)$-th bit was a zero in the original $n$-bit sequence), then
this transfer of energy creates a ghost pulse in this time-slot.
This could cause the original zero to be read as a one (see Fig. 1).

Since the overall energy is conserved, some of the pulses 
in the $k$-th, $l$-th, and $m$-th time-slots lose energy, resulting in
a lowering of their amplitude (intensity). On the other hand,
if the $(k+l-m)$-th slot already contained a pulse, then there
is an exchange of energy between the pulses in the $k$-th, $l$-th,
$m$-th, and $(k+l-m)$-th slots, leading to amplitude fluctuations.
An analytic explanation of these phenomena can be derived
using the nonlinear Schrödinger equation that describes pulse
propagation in optical fibers; see [1], [2], [26].

Fig. 1. Model of ghost-pulse formation due to the interaction of three pulses.

There are, in general, multiple $(k,l,m)$ triples that result in
the same integer $k+l-m$. Thus it is possible to have several
pulse triples generating a ghost pulse at the same time-slot. 
Of course, in reality, the number of pulse triples involved in 
ghost-pulse formation at a certain time-slot is quite small. This 
is because, as one would expect from physical considerations, 
the interaction between pulses that are sufficiently far apart in 
the transmitted pulse train is weak. Indeed, for typical optical 
transmission parameters, pulses that are more than 10 to 12 
time-slots apart do not contribute significantly to the formation 
of ghost pulses [3], [19]. In any case, when multiple pulse 
triples generate a ghost pulse at the same zero time-slot, the
resulting ghost pulse is the superposition of the ghost pulses
formed by each of the pulse triples. The superposition of multi- 
ple ghost pulses may result in a stronger ghost pulse, or some- 
times (due to destructive interference) in a weaker ghost pulse. 

As shown in [4], [19], the phases of the original pulses 
play a vital role in determining which pulses lose energy and 
which gain energy in the course of the energy transfer induced 
by FWM. The phase of a ghost pulse created by a given 
pulse triple depends on the phases of all the pulses in the 
triple. Thus, in the case of a superposition of multiple triples, 
the relationship of the phase of the resulting ghost pulse to 
that of all the pulses involved in its creation can be quite 
complex. Indeed, even the amplitude of a ghost pulse depends 
on the phases of the pulse triples involved in its creation, since 
superposition of ghost pulses with opposing phases at the same 
time-slot will actually suppress ghost-pulse formation. 

Physically, a ghost pulse is just another pulse of light. Thus 
it is possible that ghost-pulse formation may propagate: the 
interaction of a ghost pulse with actual pulses or other ghost 
pulses may lead to the creation of even more ghost pulses. 

Finally, it should be noted that FWM is primarily a problem 
with long-haul and ultra long-haul optical communication sys- 
tems, operating at 40Gbps. This is so because the amplitude of 
ghost pulses grows linearly with propagation distance. A long- 
haul system consists of many periods of alternating spans of 
conventional and dispersion-compensating fiber. This causes 
quasi-periodic broadening and compression of the
information-bearing pulses. For typical transmission parameters,
ghost-pulse amplitude reaches significant proportions over several
periods of the dispersion map, each typically 50-100 km long.
The simulations reported in the literature [19], [23], [26] were
carried out over links of length 500 km to 5000 km.

B. Related Modulation Techniques 

The optics literature has seen the emergence of several simple 
modulation schemes [4], [7], [16], [19] aimed at reducing 
the impact of FWM. Most of these schemes are based on the 
fact that FWM is a phase-sensitive effect and, therefore, can 
be controlled by modulating the phase of the pulses being 
transmitted. The one exception is the modulation scheme 
of [16], which proposes to use unequally spaced pulses at the 
expense of sacrificing spectral efficiency. 

Coding — that is, introduction of redundancy in the trans- 
mitted bits as a means of controlling errors — has not been 
given much consideration as an approach to mitigating the 
FWM effect. To the best of our knowledge, the only previous 
work in this area has been reported by Vasic, Rao, Djordjevic, 
Kostuk, and Gabitov in [24]. In that paper, the authors use 
sequences satisfying a certain maximum-transition run (MTR) 
constraint to counter the impact of FWM. In the language of 
constrained coding, a binary sequence $x$ is said to satisfy an
MTR($j$) constraint if every run of ones in $x$ has length at
most $j$ (cf. [22]). In the modulation scheme of [24], a block
code of rate 0.8, consisting of 256 binary codewords of length 
10 satisfying the MTR(2) constraint, is used for transmission. 
Simulation results show significant ghost-pulse reduction due 
to the use of this coding scheme. The authors of [24] conclude
that "it is possible to successfully tackle the detrimental effects
of FWM in 40-Gb/s systems using simple coding techniques."

In this paper, we undertake a systematic study of a number 
of coding schemes that combine constrained coding and phase 
modulation. Our study focuses purely on the coding-theoretic 
aspects (e.g. rate, encoding/decoding) of these schemes — we 
make no claims regarding their effectiveness in suppressing 
ghost pulses. In particular, we do not address the question 
of how well constrained coding schemes are suited to tackle 
the problem of eliminating ghost pulses in real-world optical 
systems. Such questions can only be answered via experimen- 
tation and/or extensive simulations of the fiber-optic channel, 
which is beyond the scope of this work. 

C. Binary Ghost-Pulse Constraint 

To formulate a well-defined coding problem, we model the 
formation of (primary) ghost pulses as follows. Let $b_1 b_2 \ldots b_n$,
with $b_i \in \{0, 1\}$, be the binary sequence corresponding to the
train of pulses sent across the fiber optic medium. If for some
integers $k$, $l$, and $m$ (not necessarily all distinct), we have

$$ b_k = b_l = b_m = 1 \quad\text{while}\quad b_{k+l-m} = 0 \tag{1} $$

then the formation of a primary ghost pulse converts the zero
in time-slot $k+l-m$ to a one. Note that if we can encode
the transmitted binary sequence in such a way that (1) never
occurs, we will eliminate all (higher-order) ghost pulses
caused by ghost-pulse propagation (discussed in Section I-A)
as well. For example, a sequence containing at most one 1, or
the all-ones sequence, or a sequence of alternating zeros and
ones all satisfy this condition. In general, we say that a binary
sequence $c_1 c_2 \ldots c_n$ satisfies the binary ghost-pulse (BGP)
constraint if for all integers $k, l, m$ such that $c_k = c_l = c_m = 1$
and $0 \le k+l-m \le n-1$, we also have $c_{k+l-m} = 1$. It
is clear that transmitting a sequence that satisfies the BGP
constraint will not allow ghost pulses to be created.

Let $f_{\mathrm{BGP}}(n)$ be the number of binary sequences of length $n$
that satisfy the BGP constraint. Then the asymptotic information
rate (or the capacity, or the entropy) of the BGP constraint
is defined (cf. [18], [21]) as follows:

$$ H_{\mathrm{BGP}} \stackrel{\mathrm{def}}{=} \lim_{n\to\infty} \frac{1}{n} \log_2 f_{\mathrm{BGP}}(n). \tag{2} $$

Of course, we would like $H_{\mathrm{BGP}}$ to be as close to 1 as possible,
so that coding into the BGP constraint adds little redundancy
to the information being encoded. However, as we will show
in Section III, a finite-length binary sequence satisfies the BGP
constraint if and only if the ones in the sequence are uniformly
spaced; that is, the positions of the ones form an arithmetic
progression. It follows that there are $O(n^2)$ binary sequences
of length $n$ that satisfy the BGP constraint, and $H_{\mathrm{BGP}} = 0$.
Hence, we need to investigate alternative approaches to dealing 
with the ghost-pulse problem. 

One approach that we consider is based on the intuition that 
the interaction between pulses that are sufficiently far apart in 
the transmitted pulse train is weak. As noted in Section I-A, in
a typical optical communication scenario, pulses that are more
than 10-12 time-slots apart do not contribute significantly to
the formation of ghost pulses (cf. [3], [19]).

Disregarding the interaction between ones that are separated
by more than some fixed distance $t$, we say that a binary
sequence $c_1 c_2 \ldots c_n$ satisfies the BGP($t$) constraint if for all
integers $k, l, m$ (not necessarily distinct) such that

$$ c_k = c_l = c_m = 1, \tag{3} $$

$$ 0 \le k+l-m \le n-1, \tag{4} $$

and

$$ \max\{|k-l|, |l-m|, |m-k|\} \le t \tag{5} $$

we also have $c_{k+l-m} = 1$. The capacity $H_2(t)$ of the BGP($t$)
constraint can be defined as in (2). (We will provide formal
definitions for the capacities of all such constraints in the next
section.) In Section III, we will show that $H_2(t)$ is positive
for all $t$. However, we also show in Section III that $H_2(t)$
lies in the range 0.21-0.25 when $t \in \{10, 11, 12\}$. This
makes the BGP($t$) constraint somewhat unattractive as the
basis for a coding scheme. Nevertheless, we briefly discuss
in Section III the design of finite-state encoders that take an
unconstrained binary sequence as input and produce a
BGP($t$)-constrained sequence as output.

D. Ternary Ghost-Pulse Constraint 

Another approach that has been suggested [4], [7], [19] to 
mitigate the formation of ghost pulses is to apply, at the 
transmitter end, a phase shift of $\pi$ to some of the pulses.
We can effectively think of this phase-modulation technique as
converting a binary sequence $b_1 b_2 \ldots b_n$, with $b_i \in \{0, 1\}$, into
a ternary sequence $c_1 c_2 \ldots c_n$, where $c_i \in \{-1, 0, 1\}$, such that
$b_i = |c_i|$ for all $i$. One reason behind this phase-modulation
approach is that, as explained in Section I-A, superposition
of the contributions due to multiple pulse triples will result 
in suppression of ghost-pulse formation if their interference 
is destructive. Thus, knowledge of the relationship between 
the phase of a ghost pulse and the phases of the pulses 
involved in its creation makes it possible to manipulate the 
phase of the transmitted pulses in a way that encourages 
destructive interference. Such phase modulation schemes are 
very effective at eliminating some of the stronger ghost pulses 
(cf. [4], [19]). However, as observed in [4], it is impossible to 
achieve destructive interference in several consecutive zero- 
slots. Moreover, these schemes do not mitigate the "side 
ghosts" that arise due to energy leakage from the one-slots into 
adjacent zero-slots. Therefore, another approach is to modulate 
the phase of the transmitted pulses with the aim of achieving 
energy redistribution among the one-slots, thereby preventing 
energy leakage into adjacent zero-slots. Overall, building upon 
the work of [4], it appears reasonable to try preventing 
situations in which pulses in time-slots $k$, $l$, and $m$ all have the
same phase, while the slot at time $k+l-m$ is empty (zero).

Thus we say that a ternary sequence $c_1 c_2 \ldots c_n$ satisfies the
ternary ghost-pulse (TGP) constraint if for all integers $k, l, m$
(not necessarily distinct) such that $0 \le k+l-m \le n-1$, and

$$ c_k = c_l = c_m = +1 \quad\text{or}\quad c_k = c_l = c_m = -1 \tag{6} $$

we also have $c_{k+l-m} \ne 0$. Let $\mathcal{T}_3$ be the set of all finite-length
ternary sequences that satisfy the TGP constraint. To transmit a
finite-length binary data sequence, we encode it as a sequence
from $\mathcal{T}_3$. Based on the discussion above, we shall assume, as a
first-order approximation, that sequences in $\mathcal{T}_3$ are effective in
mitigating ghost-pulse formation, so the transmitted sequence
can be recovered without error at the receiver end.

However, there is a catch. Most long-haul optical commu- 
nication systems use direct-detection optical receivers, which 
can only detect the intensity (amplitude) of the optical signal 
at the channel output, not its phase. Thus if the transmitted
ternary sequence was $c_1 c_2 \ldots c_n$, then the receiver only sees
the sequence $|c_1|, |c_2|, \ldots, |c_n|$. In other words, the receiver
cannot distinguish a $+1$ from a $-1$. As a result, we cannot
use two sequences in $\mathcal{T}_3$ that differ only in phase (sign) to
encode two different binary data sequences.

We thus have a rather unusual coding problem: even though 
the sequence being transmitted is ternary, the alphabet used 
for encoding information is effectively binary. In general, 
discrete channels for which the output alphabet is smaller 
than the input alphabet are rarely encountered in information 
theory. In fact, to the best of our knowledge, a situation 
where the alphabet over which the constraint is defined is 
different from the information-bearing alphabet has not been 
previously studied in the constrained coding literature. 

In order to describe the procedure for encoding a binary data
sequence using TGP-constrained sequences, we define the set

$$ \mathcal{B}_3 \stackrel{\mathrm{def}}{=} \{ |c_1|, |c_2|, \ldots, |c_n| \; : \; c_1 c_2 \ldots c_n \in \mathcal{T}_3 \}. \tag{7} $$

This is the set of all finite-length binary sequences that can be
converted to a sequence in $\mathcal{T}_3$ by changing certain 1's to $-1$'s.
To transmit a binary data sequence $a_1 a_2 \ldots a_N$, we first encode
it as a sequence $b_1 b_2 \ldots b_n \in \mathcal{B}_3$, which is then converted to
a corresponding sequence $c_1 c_2 \ldots c_n \in \mathcal{T}_3$ at the input to an
optical channel. At the channel output, the receiver detects
the sequence $b_1 b_2 \ldots b_n$, which can be uniquely decoded to
recover the original binary sequence $a_1 a_2 \ldots a_N$.
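
The sign-lifting step of this procedure can be illustrated with a minimal Python sketch. It is not the encoder developed in the paper: the function names `satisfies_tgp`, `lift_to_tgp`, and `direct_detect` are hypothetical, the sign search is exhaustive (exponential in the number of ones), and positions are indexed from 0, which is harmless because the constraint is invariant under shifting all indices.

```python
from itertools import product

def satisfies_tgp(c):
    """True if no same-sign triple (k, l, m) points at an in-range zero slot."""
    n = len(c)
    nonzero = [i for i, s in enumerate(c) if s != 0]
    for k, l, m in product(nonzero, repeat=3):
        if c[k] == c[l] == c[m]:
            p = k + l - m
            if 0 <= p < n and c[p] == 0:
                return False
    return True

def lift_to_tgp(bits):
    """Search all sign patterns for a TGP-satisfying lift; None if none exists."""
    ones = [i for i, b in enumerate(bits) if b == 1]
    for signs in product((1, -1), repeat=len(ones)):
        c = [0] * len(bits)
        for pos, sign in zip(ones, signs):
            c[pos] = sign
        if satisfies_tgp(c):
            return c
    return None

def direct_detect(c):
    """Intensity-only receiver: the sign (phase) of each pulse is lost."""
    return [abs(s) for s in c]

bits = [1, 1, 0]                 # violates the BGP constraint as a binary word
lifted = lift_to_tgp(bits)
print(lifted, direct_detect(lifted))   # [1, -1, 0] [1, 1, 0]
```

The word `110` is not BGP-constrained, yet flipping the phase of its second pulse removes the offending same-phase triple, and the receiver still observes `110`.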

The capacity $H_{\mathrm{TGP}}$ of the TGP constraint can be now
defined in a manner analogous to (2). Let $f_{\mathrm{TGP}}(n)$ denote
the number of sequences of length $n$ in the set $\mathcal{B}_3$ defined in (7). Then

$$ H_{\mathrm{TGP}} \stackrel{\mathrm{def}}{=} \lim_{n\to\infty} \frac{1}{n} \log_2 f_{\mathrm{TGP}}(n). \tag{8} $$

The analysis of the TGP constraint appears to be a much
more difficult problem than analysis of the BGP constraint.
However, we conjecture that $H_{\mathrm{TGP}} = H_{\mathrm{BGP}} = 0$. Strong
evidence in support of this conjecture is given in [15] (see
Section IV-A).

Consequently, we consider the weaker TGP($t$) constraint
obtained, similarly to the BGP($t$) constraint, by ignoring
interactions between nonzero symbols that are more than distance
$t$ apart. Define the set $\mathcal{T}_{3;t}$ by adjoining the extra condition (5)
to (6). The capacity $H_3(t)$ of the TGP($t$) constraint can be
then defined as in (8), but with respect to the set $\mathcal{T}_{3;t}$ rather
than $\mathcal{T}_3$. One can reasonably expect that as $t$ increases, $H_3(t)$
decreases, converging upon $H_{\mathrm{TGP}}$ in the limit as $t \to \infty$.
Indeed, we will prove in the next section that

$$ H_{\mathrm{TGP}} = \lim_{t\to\infty} H_3(t) = \inf_{t \ge 1} H_3(t). \tag{9} $$

This provides a means of computing increasingly tight upper
bounds on the capacity $H_{\mathrm{TGP}}$ which, as we mentioned earlier,
is not easy to compute directly. Furthermore, we will show in
Section IV that

$$ H_3(1) = 1 \quad\text{and}\quad H_3(2) \approx 0.96. \tag{10} $$

These values are significantly larger than the corresponding
values for the BGP($t$) constraint, namely

$$ H_2(1) \approx 0.69 \quad\text{and}\quad H_2(2) \approx 0.55. \tag{11} $$

Moreover, it appears from (10) and (11) that $H_3(t)$ decreases
much more slowly with $t$ than $H_2(t)$, since $H_2(1) - H_2(2) \approx 0.14$
while $H_3(1) - H_3(2) \approx 0.04$. Assuming that this trend
continues for larger values of $t$, coding schemes based on
TGP($t$)-constrained sequences can be a reasonably efficient means of
mitigating the ghost-pulse effect in optical communications.

Unfortunately, the techniques we use in Section IV to determine
$H_3(1)$ and $H_3(2)$ do not easily generalize to the computation
of $H_3(t)$ for arbitrary $t$. Thus we have been unable to
verify whether the aforementioned trend continues for larger
values of $t$. In Section IV-D, we describe a general method for
computing $H_3(t)$; however, this method is too computationally
intensive to be implemented in practice. Nevertheless, we do
discuss (also in Section IV) the design of finite-state encoders
for coding schemes involving TGP($t$)-constrained sequences.

Remark. Before concluding this introductory section, we note 
that it is possible to design other coding schemes that combine 
constrained coding with phase modulation in order to achieve 
ghost-pulse suppression. For example, we can conceivably add 
phase modulation to the constrained coding scheme of [24], 
thereby gaining some improvement in performance. In this 
paper, however, we have chosen to focus solely on the BGP 
and TGP constraints. The unusual nature of these constraints 
requires the development of non-standard tools for their anal- 
ysis, which may be of independent interest to coding theorists. 

II. Definitions and Preliminary Results 

In this section, we formally define the various types of ghost- 
pulse constraints that we shall be interested in. We also 
give precise definitions for the corresponding capacities, and 
establish several useful relationships between them. 

Let $\mathbb{Z}$ and $\mathbb{Z}^+$ denote the set of integers and the set of
positive integers, respectively. Given $n, n' \in \mathbb{Z}$, we write

$$ [n, n'] \stackrel{\mathrm{def}}{=} \{ i \in \mathbb{Z} : n \le i \le n' \}, $$

and abbreviate $[1, n]$ as $[n]$. Note that both $[n]$ and $[n, n']$
could be empty. Let $\mathcal{A}_2 = \{0, 1\}$
and let $\mathcal{A}_3 = \{-1, 0, 1\}$. These are the relevant alphabets for
the binary and the ternary ghost-pulse constraints, respectively.
However, rather than giving definitions for the binary case and
the ternary case separately, we find it more convenient to define
the ghost-pulse constraints over a generic $q$-ary alphabet.
Thus, given an integer $q \ge 2$, let $\mathcal{A}_q$ denote a fixed set of $q$
letters, one of which is a distinguished letter 0. Although this
is not required in what follows, a good way to think of $\mathcal{A}_q$
is as the set of distinct $(q-1)$-th roots of unity, augmented by
zero. For $n \in \mathbb{Z}^+$, let $\mathcal{A}_q^n$ denote the set of sequences of length
$n$ over $\mathcal{A}_q$. Given $x = (x_1 x_2 \ldots x_n) \in \mathcal{A}_q^n$, the support of $x$
is defined as $\mathrm{supp}(x) = \{ i \in [n] : x_i \ne 0 \}$.

Definition 1. A sequence $x \in \mathcal{A}_q^n$ satisfies the $q$-ary ghost-pulse
($q$-GP) constraint if for all $k, l, m \in \mathrm{supp}(x)$ such that

$$ x_k = x_l = x_m $$

either $k+l-m \in \mathrm{supp}(x)$ or $k+l-m \notin [n]$. Note that the
integers $k, l, m$ above are not necessarily distinct.
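
Definition 1 translates directly into a brute-force membership test. The following Python sketch is only an illustration: the names `support` and `satisfies_qgp` are hypothetical, symbols are represented as integers with 0 as the distinguished letter, and the running time is cubic in the support size.

```python
from itertools import product

def support(x):
    """1-based positions of the nonzero symbols of x."""
    return [i for i, s in enumerate(x, start=1) if s != 0]

def satisfies_qgp(x):
    """Check Definition 1 by testing every triple (k, l, m) in supp(x)."""
    n = len(x)
    for k, l, m in product(support(x), repeat=3):
        if x[k - 1] == x[l - 1] == x[m - 1]:
            p = k + l - m
            # p must be outside [n] or carry a nonzero symbol
            if 1 <= p <= n and x[p - 1] == 0:
                return False
    return True

print(satisfies_qgp([1, 0, 1, 0, 1]))   # True: alternating zeros and ones
print(satisfies_qgp([1, 1, 0]))         # False: (k, l, m) = (2, 2, 1) hits slot 3
print(satisfies_qgp([1, -1, 0]))        # True: the two pulses differ in sign
```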

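For $n \in \mathbb{Z}^+$, let $\mathcal{T}_q(n)$ be the set of sequences of length $n$
over $\mathcal{A}_q$ that satisfy the $q$-GP constraint. Further define

$$ \mathcal{T}_q \stackrel{\mathrm{def}}{=} \bigcup_{n=1}^{\infty} \mathcal{T}_q(n). \tag{12} $$

This is the set of all finite-length sequences satisfying the $q$-GP
constraint. Let $\xi : \mathcal{A}_q \to \mathcal{A}_2$ be the "absolute value" function,
defined by

$$ \xi(x) \stackrel{\mathrm{def}}{=} \begin{cases} 0 & \text{if } x = 0 \\ 1 & \text{otherwise.} \end{cases} \tag{13} $$

For all $n \in \mathbb{Z}^+$, we extend this "absolute value" function
componentwise to a function $\xi : \mathcal{A}_q^n \to \mathcal{A}_2^n$ via

$$ \xi(x_1 x_2 \ldots x_n) \stackrel{\mathrm{def}}{=} (\xi(x_1), \xi(x_2), \ldots, \xi(x_n)). \tag{14} $$

Given such a function, we further define for all $n \in \mathbb{Z}^+$ the
sets $\mathcal{B}_q(n) \subseteq \mathcal{A}_2^n$ as follows

$$ \mathcal{B}_q(n) = \xi(\mathcal{T}_q(n)) = \{ \xi(x) : x \in \mathcal{T}_q(n) \}. \tag{15} $$

Finally, we set $\mathcal{B}_q = \xi(\mathcal{T}_q) = \bigcup_{n=1}^{\infty} \mathcal{B}_q(n)$. Thus, if $\mathcal{A}_q \setminus \{0\}$
is indeed a set of complex roots of unity, then $\mathcal{B}_q(n)$,
respectively $\mathcal{B}_q$, consists of those binary sequences that can
be transformed into a sequence in $\mathcal{T}_q(n)$, respectively $\mathcal{T}_q$, by
means of appropriate phase shifts. In particular, our definition
of $\mathcal{B}_3$ based upon (15) coincides with the earlier definition in
(7).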

Definition 2. For all integers $q \ge 2$, the capacity of the $q$-ary
ghost-pulse constraint is defined by

$$ H_q \stackrel{\mathrm{def}}{=} \lim_{n\to\infty} \frac{\log_2 |\mathcal{B}_q(n)|}{n}. \tag{16} $$

It should be immediately clear from the discussion above
that $H_2 = H_{\mathrm{BGP}}$ and $H_3 = H_{\mathrm{TGP}}$, as defined in (2) and
(8), respectively. The following proposition shows that all these
capacities are, indeed, well-defined.

Proposition 1. The limit below exists for all $q \ge 2$, and
moreover

$$ \lim_{n\to\infty} \frac{\log_2 |\mathcal{B}_q(n)|}{n} = \inf_{n \ge 1} \frac{\log_2 |\mathcal{B}_q(n)|}{n}. \tag{17} $$

Proof. This follows from the standard argument for shift
spaces (see e.g. [18, pp. 103-104]), which we briefly reproduce
here for completeness. Use the following test for convergence
from elementary calculus: if $a_1, a_2, \ldots$ is a sequence of
nonnegative numbers such that $a_{n+n'} \le a_n + a_{n'}$ for all $n, n' \ge 1$,
then $\lim_{n\to\infty} a_n/n$ exists and equals $\inf_{n \ge 1} a_n/n$. Apply this
test to the sequence defined by $a_n = \log_2 |\mathcal{B}_q(n)|$. We need
to show that $a_{n+n'} \le a_n + a_{n'}$ or, equivalently, that

$$ |\mathcal{B}_q(n+n')| \le |\mathcal{B}_q(n)| \, |\mathcal{B}_q(n')|. \tag{18} $$

But this easily follows from the observation that if a sequence
$y \in \mathcal{A}_q^{n+n'}$ satisfies the $q$-GP constraint, then every contiguous
subsequence of $y$ also satisfies the $q$-GP constraint. Hence if
$(x_1 x_2 \ldots x_{n+n'}) \in \mathcal{B}_q(n+n')$, then $(x_1 x_2 \ldots x_n) \in \mathcal{B}_q(n)$ and
$(x_{n+1} x_{n+2} \ldots x_{n+n'}) \in \mathcal{B}_q(n')$, which implies (18). ■
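
The submultiplicativity in (18) can be checked numerically for $q = 2$, where $\xi$ is the identity and $\mathcal{B}_2(n) = \mathcal{T}_2(n)$. The Python sketch below uses exhaustive enumeration for small $n$ only; the names `satisfies_bgp` and `count_b2` are hypothetical, and the check illustrates the inequality rather than proving it.

```python
from itertools import product
from math import log2

def satisfies_bgp(x):
    """Binary case of Definition 1: k + l - m is in supp(x) or outside [n]."""
    n = len(x)
    supp = [i for i, s in enumerate(x, start=1) if s == 1]
    return all(
        not (1 <= k + l - m <= n) or x[k + l - m - 1] == 1
        for k, l, m in product(supp, repeat=3)
    )

def count_b2(n):
    """|B_2(n)| by exhaustive enumeration (feasible only for small n)."""
    return sum(satisfies_bgp(x) for x in product((0, 1), repeat=n))

counts = {n: count_b2(n) for n in range(1, 11)}

# Inequality (18): |B_2(n + n')| <= |B_2(n)| * |B_2(n')|
subadditive = all(
    counts[n + m] <= counts[n] * counts[m]
    for n in range(1, 6) for m in range(1, 6)
)

# The normalized rates log2|B_2(n)| / n, whose limit is H_2
rates = [log2(counts[n]) / n for n in range(1, 11)]
print(subadditive, [round(r, 3) for r in rates])
```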

We will show in Section III-A that $H_2 = 0$, and our analysis
in Section IV-A will lead us to conjecture that $H_3 = 0$ as well.
In fact, we believe that $H_q = 0$ for all $q$. This is so because the
$q$-GP constraint has unbounded memory. For large $n$, the value
of a sequence $x \in \mathcal{T}_q(n)$ at a given position $i \in [n]$ depends on
the values of $x$ at (essentially) all other positions. To obtain
nonzero capacities, we relax the $q$-GP constraint by bounding
its effective memory, as made precise in the next definition.
As explained in Section I, it makes physical sense to do so.

Definition 3. Let $t \in \mathbb{Z}^+$ be fixed. A sequence $x \in \mathcal{A}_q^n$ satisfies
the $q$-GP($t$) constraint if for all $k, l, m \in \mathrm{supp}(x)$ such that

$$ x_k = x_l = x_m \quad\text{and}\quad \max\{|k-l|, |l-m|, |m-k|\} \le t $$

either $k+l-m \in \mathrm{supp}(x)$ or $k+l-m \notin [n]$. As before,
the integers $k, l, m$ above need not be all distinct.
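
The effect of the bounded-memory relaxation in Definition 3 can be seen on a small example. The Python sketch below (with the hypothetical name `satisfies_qgp_t`) adds the window test to a brute-force check over triples; it is an illustration under the same integer-symbol representation assumed earlier, not an efficient algorithm.

```python
from itertools import product

def satisfies_qgp_t(x, t):
    """Check Definition 3: only triples with pairwise spread <= t are tested."""
    n = len(x)
    supp = [i for i, s in enumerate(x, start=1) if s != 0]
    for k, l, m in product(supp, repeat=3):
        if max(abs(k - l), abs(l - m), abs(m - k)) > t:
            continue  # interaction ignored: pulses too far apart
        if x[k - 1] == x[l - 1] == x[m - 1]:
            p = k + l - m
            if 1 <= p <= n and x[p - 1] == 0:
                return False
    return True

x = [1, 0, 0, 1, 0, 0, 0]            # ones at positions 1 and 4
print(satisfies_qgp_t(x, 2))         # True: the ones are 3 apart, so ignored
print(satisfies_qgp_t(x, 3))         # False: (4, 4, 1) points at empty slot 7
```

The same word is admissible for $t = 2$ but not for $t = 3$, which is the sense in which smaller $t$ yields a weaker constraint.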

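For $n, t \in \mathbb{Z}^+$, we let $\mathcal{T}_{q;t}(n)$ denote the set of sequences of
length $n$ over $\mathcal{A}_q$ satisfying the $q$-GP($t$) constraint, and define
$\mathcal{T}_{q;t} = \bigcup_{n=1}^{\infty} \mathcal{T}_{q;t}(n)$ as in (12). With the help of the function
$\xi : \mathcal{A}_q^n \to \mathcal{A}_2^n$ given by (13) and (14), we define

$$ \mathcal{B}_{q;t}(n) = \xi(\mathcal{T}_{q;t}(n)) \stackrel{\mathrm{def}}{=} \{ \xi(x) : x \in \mathcal{T}_{q;t}(n) \} \tag{19} $$

and write $\mathcal{B}_{q;t} = \xi(\mathcal{T}_{q;t}) = \bigcup_{n=1}^{\infty} \mathcal{B}_{q;t}(n)$. We can now define
the capacity of the $q$-GP($t$) constraint as follows.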

Definition 4. For all integers $q \ge 2$ and $t \ge 1$, the capacity of
the $q$-GP($t$) constraint is defined by

$$ H_q(t) \stackrel{\mathrm{def}}{=} \lim_{n\to\infty} \frac{\log_2 |\mathcal{B}_{q;t}(n)|}{n}. \tag{20} $$

Exactly the same argument that we used in the proof of
Proposition 1 can be now used to show that the limit in (20) exists,
and in fact

$$ H_q(t) = \inf_{n \ge 1} \frac{\log_2 |\mathcal{B}_{q;t}(n)|}{n}. \tag{21} $$

Observe that, for all fixed $q$, the sequence $H_q(1), H_q(2), \ldots$ is
a nonincreasing sequence of nonnegative numbers. This is so
because $\mathcal{B}_{q;t+1}(n) \subseteq \mathcal{B}_{q;t}(n)$ for all $n \in \mathbb{Z}^+$ and all $t \in \mathbb{Z}^+$, as
is evident from Definition 3. Therefore $\lim_{t\to\infty} H_q(t)$ exists,
and equals $\inf_{t \ge 1} H_q(t)$. The following proposition shows that
this limit is also equal to $H_q$, as defined in (16).

Proposition 2. For all integers $q \ge 2$,

$$ H_q = \lim_{t\to\infty} H_q(t) = \inf_{t \ge 1} H_q(t). \tag{22} $$

Proof. Let $\alpha_q \stackrel{\mathrm{def}}{=} \inf_{t \ge 1} H_q(t)$. We have already shown that
$\lim_{t\to\infty} H_q(t) = \alpha_q$, so it remains to prove that $H_q = \alpha_q$.
It follows immediately from Definition 1 and Definition 3 that
$\mathcal{B}_{q;t}(n) \supseteq \mathcal{B}_q(n)$ for all $n, t \in \mathbb{Z}^+$. Hence $|\mathcal{B}_{q;t}(n)| \ge |\mathcal{B}_q(n)|$
and $H_q(t) \ge H_q$ for all $t \in \mathbb{Z}^+$. Letting $t \to \infty$, we conclude
that $\alpha_q \ge H_q$. For the reverse inequality, first fix an $n \in \mathbb{Z}^+$
and observe that $\mathcal{B}_q(n) = \mathcal{B}_{q;n}(n)$. Therefore

$$ \frac{\log_2 |\mathcal{B}_q(n)|}{n} = \frac{\log_2 |\mathcal{B}_{q;n}(n)|}{n} \ge \inf_{m \ge 1} \frac{\log_2 |\mathcal{B}_{q;n}(m)|}{m}. $$

Note that the right-hand side above is precisely $H_q(n)$ in view
of (21), and $H_q(n) \ge \alpha_q$ by the definition of $\alpha_q$. It follows that
$\log_2 |\mathcal{B}_q(n)| / n \ge \alpha_q$ for all $n \in \mathbb{Z}^+$, and therefore $H_q \ge \alpha_q$.
This completes the proof of the proposition. ■
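
The two set relations used in this proof, namely the nesting $\mathcal{B}_{q;t+1}(n) \subseteq \mathcal{B}_{q;t}(n)$ and the identity $\mathcal{B}_q(n) = \mathcal{B}_{q;n}(n)$, can be confirmed by exhaustive enumeration for small $n$. The Python sketch below is an illustration only; `LETTERS`, `satisfies_qgp_t`, and `b_set` are hypothetical names, and passing `t = float("inf")` is a convenience used here to obtain the unrelaxed $q$-GP constraint.

```python
from itertools import product

LETTERS = {2: (0, 1), 3: (-1, 0, 1)}   # the alphabets A_2 and A_3

def satisfies_qgp_t(x, t):
    """q-GP(t) check; t = float('inf') gives the plain q-GP constraint."""
    n = len(x)
    supp = [i for i, s in enumerate(x, start=1) if s != 0]
    for k, l, m in product(supp, repeat=3):
        if max(abs(k - l), abs(l - m), abs(m - k)) > t:
            continue
        if x[k - 1] == x[l - 1] == x[m - 1]:
            p = k + l - m
            if 1 <= p <= n and x[p - 1] == 0:
                return False
    return True

def b_set(q, n, t):
    """B_{q;t}(n): image under xi of the q-GP(t) sequences of length n."""
    return {
        tuple(0 if s == 0 else 1 for s in x)
        for x in product(LETTERS[q], repeat=n)
        if satisfies_qgp_t(x, t)
    }

nested = all(
    b_set(q, n, t + 1) <= b_set(q, n, t)
    for q in (2, 3) for n in range(1, 7) for t in range(1, n + 1)
)
full_window = all(
    b_set(q, n, n) == b_set(q, n, float("inf"))
    for q in (2, 3) for n in range(1, 7)
)
print(nested, full_window)
```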

Observe that our claim in (9) follows as a special case (for
$q = 3$) from Proposition 2. Thus, as discussed in Section I-D,
Proposition 2 provides a means of computing increasingly tight
upper bounds on $H_q$. In particular, this proposition implies that
$H_{\mathrm{TGP}} = H_3$ can be determined by studying the asymptotics
of the sequence $H_3(1), H_3(2), \ldots$. In Section IV-D, we show
that there is indeed an algorithm that can be used to compute
$H_3(t)$ for any given $t$. Unfortunately, this algorithm is too
computationally intensive to be useful in practice.

III. The Binary Ghost-Pulse Constraints 

Following the terminology of Section I, we shall refer to the 
q-GP constraints with q = 2 as the binary ghost-pulse (BGP) 
constraints. Such constraints can be completely analyzed, and 
the purpose of this section is to present this analysis. 

A. The BGP Constraint with Unbounded Memory 

Note that Definitions 1 and 3 become somewhat redundant in
the binary case. For a binary sequence $x = (x_1 x_2 \ldots x_n)$, any
$k, l, m \in \mathrm{supp}(x)$ satisfy $x_k = x_l = x_m$. Thus the BGP
constraint is simply the requirement that for all $k, l, m \in \mathrm{supp}(x)$,
either $k+l-m \in \mathrm{supp}(x)$ or $k+l-m \notin [n]$. The following
theorem makes use of this observation to show that sequences
that satisfy the BGP constraint are precisely those whose
support set forms an arithmetic progression.

Theorem 3. For all $n \in \mathbb{Z}^+$, a sequence $x = (x_1 x_2 \ldots x_n) \in \mathcal{A}_2^n$
satisfies the BGP constraint iff there exist $a, d \in [0, n]$ such that

$$ \mathrm{supp}(x) = (a + d\mathbb{Z}) \cap [n]. \tag{23} $$

Proof. ($\Leftarrow$) Suppose that $x$ satisfies (23), and consider any
$k_1, k_2, k_3 \in \mathrm{supp}(x)$. Then $k_i = a + d j_i$ for some $j_i \in \mathbb{Z}$. Set
$j = j_1 + j_2 - j_3$. Then $k_1 + k_2 - k_3 = a + jd \in (a + d\mathbb{Z})$. Thus,
either $k_1 + k_2 - k_3 \in \mathrm{supp}(x)$ or $k_1 + k_2 - k_3 \notin [n]$.

($\Rightarrow$) Suppose that $x = (x_1 x_2 \ldots x_n)$ satisfies the BGP
constraint. If $\mathrm{supp}(x) = \emptyset$, then we can take $a = d = 0$
in (23). If $|\mathrm{supp}(x)| = 1$, then we can take $a$ to be the
unique integer in $\mathrm{supp}(x)$ and set $d = 0$. Hence, it remains
to consider the case where $|\mathrm{supp}(x)| \ge 2$. For this case, set

$$ d \stackrel{\mathrm{def}}{=} \min\{ |k - m| : k, m \in \mathrm{supp}(x),\ k \ne m \}, \tag{24} $$

and then take $a$ to be any integer with $a, a + d \in \mathrm{supp}(x)$. To
prove that $x$ satisfies (23) with this choice of $a$ and $d$, we will
first show that $(a + d\mathbb{Z}) \cap [n] \subseteq \mathrm{supp}(x)$, and then prove
that every element of $\mathrm{supp}(x)$ must also be in $a + d\mathbb{Z}$.

Claim 1: $(a + d\mathbb{Z}) \cap [n] \subseteq \mathrm{supp}(x)$. In order to establish
this claim, suppose that

$$ \{ a + \ell d : i \le \ell \le j \} \subseteq \mathrm{supp}(x) \tag{25} $$

for some $i \le 0$ and $j \ge 1$. By our choice of $a$ and $d$, we know
that (25) certainly holds for $i = 0$ and $j = 1$. Observe that

$$ a + (i-1)d = (a + id) + (a + id) - (a + (i+1)d) $$

and

$$ a + (j+1)d = (a + jd) + (a + jd) - (a + (j-1)d). $$

Hence, if $x$ satisfies the BGP constraint, then $a + (i-1)d$
and $a + (j+1)d$ belong to $\mathrm{supp}(x)$, provided only that these
positions are in $[n]$. In other words, we can grow the arithmetic
progression on the left-hand side of (25) in both directions, as
long as it fits inside $[n]$, and the claim follows.



Fig. 2. Various possibilities for the choice of $k \in \mathrm{supp}(x)$ with $k \notin (a + d\mathbb{Z})$.

Claim 2: $\mathrm{supp}(x) \subseteq (a + d\mathbb{Z})$. Assume to the contrary that
there is a $k \in \mathrm{supp}(x)$ with $k \notin (a + d\mathbb{Z})$. Then we must have

$$ a + (j-1)d < k < a + jd \tag{26} $$

for some $j \in \mathbb{Z}$ such that at least one of $a + (j-1)d$ and $a + jd$
lies in $[n]$ (cf. Fig. 2). Without loss of generality (w.l.o.g.),
suppose that $(a + jd) \in [n]$. Then $(a + jd) \in \mathrm{supp}(x)$ in view of
Claim 1. But the difference between $a + jd$ and $k$ is strictly less
than $d$ by (26), which contradicts the definition of $d$ in (24).

By Claim 1 and Claim 2, we have $\mathrm{supp}(x) = (a + d\mathbb{Z}) \cap [n]$,
which completes the proof of the theorem. ■
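
Theorem 3 lends itself to an exhaustive sanity check for short sequences. The Python sketch below compares the direct BGP test against the arithmetic-progression characterization (23) for every binary word of a given length; the names `satisfies_bgp`, `ap_supports`, and `theorem3_holds` are hypothetical, and for $d = 0$ the set $a + d\mathbb{Z}$ is taken to be $\{a\}$.

```python
from itertools import product

def satisfies_bgp(x):
    """Direct check: k + l - m is in supp(x) or outside [n] for all triples."""
    n = len(x)
    supp = [i for i, s in enumerate(x, start=1) if s == 1]
    return all(
        not (1 <= k + l - m <= n) or x[k + l - m - 1] == 1
        for k, l, m in product(supp, repeat=3)
    )

def ap_supports(n):
    """All sets (a + dZ) ∩ [n] with a, d in [0, n], as in (23)."""
    sets = set()
    for a in range(n + 1):
        for d in range(n + 1):
            sets.add(frozenset(
                i for i in range(1, n + 1)
                if (i == a if d == 0 else (i - a) % d == 0)
            ))
    return sets

def theorem3_holds(n):
    """True if both characterizations agree on every word of length n."""
    aps = ap_supports(n)
    for x in product((0, 1), repeat=n):
        supp = frozenset(i for i, s in enumerate(x, start=1) if s == 1)
        if satisfies_bgp(x) != (supp in aps):
            return False
    return True

print(all(theorem3_holds(n) for n in range(1, 10)))
```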

Corollary 4. There are at most $(n+1)^2$ sequences in $\mathcal{A}_2^n$ that
satisfy the BGP constraint, and therefore

$$ H_{\mathrm{BGP}} \stackrel{\mathrm{def}}{=} \lim_{n\to\infty} \frac{\log_2 |\mathcal{B}_2(n)|}{n} \le \lim_{n\to\infty} \frac{\log_2 (n+1)^2}{n} = 0. $$

Proof. There are $(n+1)^2$ different ways of selecting the
integers $a$ and $d$ from $[0, n]$. By Theorem 3, every sequence
in $\mathcal{B}_2(n)$ is uniquely determined by one such choice. ■

In fact, using Theorem 3 as a starting point, a more careful
analysis of the possible choices for $a$ and $d$ shows that

$$ |\mathcal{B}_2(n)| = \begin{cases} \frac{1}{4}(n+2)(n+2) & n \text{ even} \\ \frac{1}{4}(n+1)(n+3) & n \text{ odd.} \end{cases} \tag{27} $$

We leave the proof of this expression as a straightforward, but
tedious, combinatorial exercise for the reader.
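
Although the proof is left as an exercise, the closed form can be spot-checked by counting the distinct supports allowed by Theorem 3. The Python sketch below does this for small $n$, reading the leading factor in (27) as $1/4$ (an assumption that the small cases $|\mathcal{B}_2(1)| = 2$, $|\mathcal{B}_2(2)| = 4$, $|\mathcal{B}_2(3)| = 6$ support); `ap_supports` and `closed_form` are hypothetical names.

```python
def ap_supports(n):
    """Distinct sets (a + dZ) ∩ [n] with a, d in [0, n] (Theorem 3)."""
    sets = set()
    for a in range(n + 1):
        for d in range(n + 1):
            sets.add(frozenset(
                i for i in range(1, n + 1)
                if (i == a if d == 0 else (i - a) % d == 0)
            ))
    return sets

def closed_form(n):
    """Right-hand side of (27), with the leading factor read as 1/4."""
    if n % 2 == 0:
        return (n + 2) * (n + 2) // 4
    return (n + 1) * (n + 3) // 4

# By Theorem 3, |B_2(n)| equals the number of distinct admissible supports.
matches = all(len(ap_supports(n)) == closed_form(n) for n in range(1, 41))
print(matches, [closed_form(n) for n in range(1, 9)])
```

Since the count grows quadratically in $n$, the normalized rate $\log_2 |\mathcal{B}_2(n)| / n$ tends to zero, in agreement with Corollary 4.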



B. The BGP(t) Constraints 

We next take on the analysis of the BGP($t$) constraint, for
arbitrary $t \in \mathbb{Z}^+$. We will show that the BGP($t$) constraint is
closely related to the well-known $(t, \infty)$ constraint. A binary
sequence $x$ is said to satisfy the $(t, \infty)$ constraint if there are
at least $t$ zeros between any two ones in $x$. We use $\mathcal{S}_{t,\infty}(n)$
to denote the set of all $(t, \infty)$-constrained binary sequences
of length $n$. Such sequences have been extensively studied in
the constrained coding literature [12]-[14], [18], [21]. The next
theorem shows that the set $\mathcal{B}_{2;t}(n)$ of all sequences in $\mathcal{A}_2^n$ that
satisfy the BGP($t$) constraint is not much larger than $\mathcal{S}_{t,\infty}(n)$.

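Theorem 5. Let $\mathcal{Q}_t(n)$ denote the set of all sequences $x \in \mathcal{A}_2^n$ such that $\mathrm{supp}(x) = (a + d\mathbb{Z}) \cap [n]$ for some $a$ and $d$ in $[0, t]$. Then for all $n, t \in \mathbb{Z}^+$, we have

$$\mathcal{B}_{2;t}(n) = \mathcal{S}_{t,\infty}(n) \cup \mathcal{Q}_t(n). \tag{28}$$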



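Proof. It is easy to see from (the proof of) Theorem 3 that $\mathcal{Q}_t(n) \subseteq \mathcal{B}_{2;t}(n)$. Note that if $x \in \mathcal{S}_{t,\infty}(n)$, then any two distinct positions in $\mathrm{supp}(x)$ are more than $t$ apart, so the proximity condition of the BGP($t$) constraint cannot be satisfied by any $k, l, m \in \mathrm{supp}(x)$ that are not all equal. Hence, by Definition 3, all $x \in \mathcal{S}_{t,\infty}(n)$ also belong to $\mathcal{B}_{2;t}(n)$. It follows that

$$\mathcal{S}_{t,\infty}(n) \cup \mathcal{Q}_t(n) \subseteq \mathcal{B}_{2;t}(n). \tag{29}$$

To establish the inclusion in the other direction, it would suffice to show that

$$\bigl(\mathcal{B}_{2;t}(n) \setminus \mathcal{S}_{t,\infty}(n)\bigr) \subseteq \mathcal{Q}_t(n). \tag{30}$$

Thus consider an $x \in \bigl(\mathcal{B}_{2;t}(n) \setminus \mathcal{S}_{t,\infty}(n)\bigr)$. Since $x \notin \mathcal{S}_{t,\infty}(n)$, there exist distinct $k, m \in \mathrm{supp}(x)$ with $|k - m| \le t$. Define

$$d \;\stackrel{\mathrm{def}}{=}\; \min\{\,|k - m| \,:\, k, m \in \mathrm{supp}(x),\ k \ne m\,\} \tag{31}$$

as in (24), and note that $1 \le d \le t$. As in Theorem 3, let $a'$ be any integer with $a', a' + d \in \mathrm{supp}(x)$. Then exactly the same argument we used in the proof of Theorem 3 shows that

$$\mathrm{supp}(x) = (a' + d\mathbb{Z}) \cap [n]. \tag{32}$$

Finally, set $a = a' \bmod d$. Since $d \le t$ in (31), we obviously have $a \in [0, t-1]$. But $a' + d\mathbb{Z} = a + d\mathbb{Z}$, so (32) implies that $\mathrm{supp}(x) = (a + d\mathbb{Z}) \cap [n]$. Thus $x \in \mathcal{Q}_t(n)$, as desired. ■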
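The decomposition (28) can be checked by exhaustive search for small parameters. The sketch below is illustrative only. It assumes that the BGP($t$) constraint on a finite sequence restricts the BGP condition to triples $k, l, m$ whose pairwise distances are at most $t$, that positions are indexed $1, \ldots, n$, and that $d = 0$ in the definition of $\mathcal{Q}_t(n)$ yields the single position $a$; all function names are hypothetical.

```python
from itertools import product

def support(bits):
    return {i + 1 for i, b in enumerate(bits) if b == 1}

def in_bgp_t(bits, t):
    """BGP(t) check: only triples within pairwise distance t are constrained."""
    n, ones = len(bits), support(bits)
    for k in ones:
        for l in ones:
            for m in ones:
                if max(abs(k - l), abs(l - m), abs(m - k)) <= t:
                    j = k + l - m
                    if 1 <= j <= n and j not in ones:
                        return False
    return True

def in_s_t_inf(bits, t):
    """(t, infinity) check: at least t zeros between any two ones."""
    ones = sorted(support(bits))
    return all(q - p > t for p, q in zip(ones, ones[1:]))

def q_t(n, t):
    """Sequences whose support is (a + dZ) intersected with [n], a, d in [0, t]."""
    members = set()
    for a in range(t + 1):
        for d in range(t + 1):
            if d == 0:
                sup = {a} & set(range(1, n + 1))   # assumption: a + 0Z = {a}
            else:
                sup = {p for p in range(1, n + 1) if (p - a) % d == 0}
            members.add(tuple(1 if p in sup else 0 for p in range(1, n + 1)))
    return members

def theorem5_holds(n, t):
    all_seqs = list(product((0, 1), repeat=n))
    lhs = {b for b in all_seqs if in_bgp_t(b, t)}
    rhs = {b for b in all_seqs if in_s_t_inf(b, t)} | q_t(n, t)
    return lhs == rhs

print(all(theorem5_holds(n, t) for n in range(1, 10) for t in range(1, 5)))
```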

Let $C(t, \infty)$ denote the capacity of the $(t, \infty)$ constraint, given by $C(t, \infty) = \lim_{n\to\infty} \log_2 |\mathcal{S}_{t,\infty}(n)| / n$. It is well known (see e.g. [12, p. 88]) that $C(t, \infty) = \log_2 \rho_t$, where $\rho_t$ is the largest-magnitude root of the polynomial $z^{t+1} - z^t - 1$. It is also known that this root is always real, irrational [5], and lies in the open interval $(1, 2)$. Thus $0 < C(t, \infty) < 1$.

Corollary 6. Let $\rho_t$ denote the largest-magnitude root of the polynomial $z^{t+1} - z^t - 1$. Then for all $t \in \mathbb{Z}^+$, the capacity of the BGP($t$) constraint is given by

$$H_2(t) = C(t, \infty) = \log_2 \rho_t. \tag{33}$$

Proof. This follows immediately from Theorem 5. By (28), we have $|\mathcal{S}_{t,\infty}(n)| \le |\mathcal{B}_{2;t}(n)| \le |\mathcal{S}_{t,\infty}(n)| + |\mathcal{Q}_t(n)|$. Note that $|\mathcal{Q}_t(n)| \le (t+1)^2$, as there are $(t+1)^2$ different ways of choosing $a, d \in [0, t]$. The corollary now follows from (20). ■
TABLE I

Capacity of the BGP(t) constraint for t = 1, 2, ..., 20

  t   H_2(t)  |   t   H_2(t)  |   t   H_2(t)  |   t   H_2(t)
  1   0.6942  |   6   0.3282  |  11   0.2301  |  16   0.1813
  2   0.5515  |   7   0.3011  |  12   0.2180  |  17   0.1742
  3   0.4650  |   8   0.2788  |  13   0.2073  |  18   0.1678
  4   0.4057  |   9   0.2600  |  14   0.1977  |  19   0.1618
  5   0.3620  |  10   0.2440  |  15   0.1891  |  20   0.1564



It is well known [12, p. 89] (and obvious) that $\rho_t$ decreases as $t$ increases. Moreover $\lim_{t\to\infty} \log_2 \rho_t = 0$, which by Lemma 2 provides an independent confirmation of Corollary 4.

For reference, we list in Table I the value of $H_2(t) = \log_2 \rho_t$, rounded to four decimal places, for all $t = 1, 2, \ldots, 20$. As can be seen from this table, $H_2(t)$ is less than 0.25 for all $t \ge 10$. This means that codes consisting of sequences that satisfy the BGP or the BGP($t$) constraints are not particularly efficient means of mitigating the ghost-pulse problem.
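The entries of Table I can be recomputed numerically from Corollary 6. The sketch below uses plain bisection, which is one possible method and not necessarily the one used to produce the table. It relies on the facts that $z^{t+1} - z^t - 1$ equals $-1$ at $z = 1$, equals $2^t - 1 > 0$ at $z = 2$, and is increasing for $z > 1$, so the root in $(1, 2)$ is unique. The function names are illustrative.

```python
import math

def rho(t, iters=200):
    """Root of z^(t+1) - z^t - 1 in (1, 2), located by bisection."""
    def f(z):
        return z ** (t + 1) - z ** t - 1
    lo, hi = 1.0, 2.0          # f(1) = -1 < 0 and f(2) = 2^t - 1 > 0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def h2(t):
    """Capacity H_2(t) = log2(rho_t) of the BGP(t) constraint."""
    return math.log2(rho(t))

for t in range(1, 21):
    print(t, round(h2(t), 4))
```

For $t = 1$ the root is the golden ratio, which gives the first table entry.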

C. Coding Into the BGP Constraints 

Nevertheless, it may still be of interest to suggest methods 
for encoding an arbitrary binary sequence into a sequence 
satisfying the BGP or the BGP(t) constraints. 

For the BGP constraint, Theorem 3 and (27) give a precise enumeration of all the sequences in $\mathcal{B}_2(n)$. Thus unconstrained binary data can be mapped into BGP-constrained sequences using an enumerative coding technique [8].

In principle, enumerative coding can also be used to code into the BGP($t$) constraints. However, this requires precise enumeration of the sequences in $\mathcal{B}_{2;t}(n)$ for each $n \in \mathbb{Z}^+$. Unfortunately, Theorem 5 does not yield a simple formula for computing $|\mathcal{B}_{2;t}(n)|$ as a function of $n$ and $t$. Thus, enumerative coding would be unnecessarily complex in this case.

We can code into the BGP($t$) constraint with significantly lower complexity if we are willing to suffer a marginal loss in coding rate. When $n$ is sufficiently large, we can ignore the contribution of $\mathcal{Q}_t(n)$ to $\mathcal{B}_{2;t}(n)$ for all practical purposes. Observe that when $t$ is fixed, $|\mathcal{Q}_t(n)|$ is bounded by the constant $(t+1)^2$ while $|\mathcal{S}_{t,\infty}(n)|$ grows exponentially with $n$.

Fig. 3. A rate 2:3 sliding-block decodable encoder for the $(1, \infty)$ constraint

Coding into the $(t, \infty)$ constraint is a very well-studied subject [13], [14], [18, Chapter 5], [21]. For all positive integers $p$ and $q$ with $p/q < C(t, \infty)$, there is a rate $p:q$ finite-state encoder for the $(t, \infty)$ constraint, meaning a finite-state machine that generates an output block of $q$ bits for every input block of $p$ bits, and converts unconstrained binary sequences into sequences that satisfy the $(t, \infty)$ constraint. For example, the graph in Fig. 3 is a rate 2:3 two-state encoder for the $(1, \infty)$ constraint. Such rate $p:q$ encoders can, in fact, be designed so that the constrained sequences they generate are amenable to decoding with a sliding-block decoder [21, Theorem 3.35]. For example, the encoder in Fig. 3 is indeed sliding-block decodable: a description of the corresponding sliding-block decoder can be derived from [18, Example 5.5.5].

It is well known [5] that the capacity $C(t, \infty)$ is irrational for all $t \ge 1$. Thus the design and the implementation of rate $p:q$ encoders necessarily becomes more cumbersome as the rate $p/q$ approaches capacity. Consequently, in situations where variable-rate encoding and state-dependent decoding are acceptable, the constrained coding technique of [6], [17], known as "bit-stuffing," is an attractive alternative. The bit-stuffing encoder comprises two components. The first is an invertible distribution transformer that converts a sequence of i.i.d. equiprobable information bits into a sequence of i.i.d. biased bits, with the probability of a zero given by a prescribed value $p$. The second component inserts (stuffs) a string of $t$ consecutive zeros following every one in this biased sequence. The decoder simply discards the string of $t$ zeros that follows each one, and then applies the inverse of the distribution transformer. It can be shown [6] that, if the parameter $p$ is optimized, the average rate of the bit-stuffing encoder equals the capacity $C(t, \infty)$.
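The stuffing and unstuffing stages described above can be sketched in a few lines. This is a minimal illustration, not the implementation of [6] or [17]: the distribution transformer is omitted, so the input is assumed to be the already-biased bit sequence. The rate expression in `stuffing_rate` is an assumption based on the usual argument (entropy per biased bit divided by the expected number of channel bits per biased bit); it is not quoted from the source. All names are hypothetical.

```python
import math

def stuff(biased_bits, t):
    """Insert t zeros after every one."""
    out = []
    for b in biased_bits:
        out.append(b)
        if b == 1:
            out.extend([0] * t)
    return out

def unstuff(channel_bits, t):
    """Discard the t zeros that follow every one."""
    out, i = [], 0
    while i < len(channel_bits):
        b = channel_bits[i]
        out.append(b)
        i += 1 + (t if b == 1 else 0)
    return out

def satisfies_t_inf(bits, t):
    """(t, infinity) check: at least t zeros between any two ones."""
    ones = [i for i, b in enumerate(bits) if b == 1]
    return all(q - p > t for p, q in zip(ones, ones[1:]))

def stuffing_rate(p, t):
    """Assumed average rate when P(zero) = p: h(p) / (p + (1 - p)(t + 1))."""
    h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return h / (p + (1 - p) * (t + 1))

data = [1, 1, 0, 1, 0, 0, 1]
sent = stuff(data, 2)
print(sent, unstuff(sent, 2) == data, satisfies_t_inf(sent, 2))

# Grid search over p for t = 1; the maximum can be compared with H_2(1) in Table I.
best = max(stuffing_rate(i / 1000, 1) for i in range(1, 1000))
print(round(best, 4))
```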

IV. The Ternary Ghost-Pulse Constraints 

It happens to be much harder to analyze the TGP and TGP(t) 
constraints than their binary counterparts BGP and BGP(t). 
Nevertheless, we will attempt to do so in this section. 

A. The TGP Constraint with Unbounded Memory 

In order to gain some understanding of the structure of finite-length TGP-constrained binary sequences, we extend the definition of the TGP constraint in a natural way to bi-infinite sequences, that is, sequences indexed by the set of integers $\mathbb{Z}$.

Definition 5. A bi-infinite sequence $x = \{x_j\}_{j \in \mathbb{Z}}$ over the ternary alphabet $\mathcal{A}_3 = \{-1, 0, 1\}$ is said to satisfy the TGP constraint if for all $k, l, m \in \mathbb{Z}$ such that $x_k$, $x_l$, and $x_m$ are equal and nonzero, we also have $x_{k+l-m} \ne 0$.

Let $\mathcal{T}_3^*$ denote the set of all bi-infinite ternary sequences satisfying the TGP constraint, and let $\mathcal{B}_3^* = \xi(\mathcal{T}_3^*)$ denote the set of all binary bi-infinite sequences that can be converted to a sequence in $\mathcal{T}_3^*$ by changing some of their 1's to $-1$'s. Using results from a branch of mathematics known as Ramsey theory [10], we have shown in [15] that any $y \in \mathcal{B}_3^*$ is almost periodic: it differs from a periodic sequence in at most two positions. Based on this and other results, we conjecture that the capacity $H_{\mathrm{TGP}} = H_3$ of the TGP constraint is zero.

In Table II, we list the number of sequences in $\mathcal{B}_3(n)$ for all $n = 1, 2, \ldots, 32$. All the values in Table II have been found by exhaustive computer search. We then used these values to plot $\log_2 |\mathcal{B}_3(n)| / n$ as a function of $n$ in Fig. 4. As can be seen from this plot, the value of $\log_2 |\mathcal{B}_3(n)| / n$ decreases steadily as $n$ increases, lending some further credence to our conjecture that $H_{\mathrm{TGP}} = \lim_{n\to\infty} \log_2 |\mathcal{B}_3(n)| / n = 0$.
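For small $n$, an exhaustive search of the kind mentioned above is easy to sketch. The code below is not the authors' program. It assumes that, for a finite sequence with positions $1, \ldots, n$, a triple $k, l, m$ with $k + l - m$ outside $[n]$ imposes no condition, and it counts the binary patterns that admit at least one valid sign assignment. The function names are hypothetical.

```python
from itertools import product

def satisfies_tgp(x):
    """TGP check on a finite tuple over {-1, 0, +1}, positions 1..n."""
    n = len(x)
    for sign in (1, -1):
        idx = [i + 1 for i, v in enumerate(x) if v == sign]
        for k in idx:
            for l in idx:
                for m in idx:
                    j = k + l - m
                    # Assumption: positions outside [n] impose no condition.
                    if 1 <= j <= n and x[j - 1] == 0:
                        return False
    return True

def count_b3(n):
    """Number of binary patterns that some valid ternary sequence projects onto."""
    patterns = {tuple(abs(v) for v in x)
                for x in product((-1, 0, 1), repeat=n)
                if satisfies_tgp(x)}
    return len(patterns)

for n in range(1, 9):
    print(n, count_b3(n))
```

The printed counts can be compared with the first rows of Table II. The search visits all $3^n$ ternary sequences, so it is only practical for short lengths.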
TABLE II

Values of |B_3(n)| for n = 1, 2, ..., 32

  n   |B_3(n)|  |   n   |B_3(n)|  |   n   |B_3(n)|  |   n   |B_3(n)|
  1         2   |   9       240   |  17      2591   |  25     11497
  2         4   |  10       358   |  18      3245   |  26     13427
  3         8   |  11       501   |  19      3977   |  27     15521
  4        16   |  12       705   |  20      4881   |  28     17952
  5        32   |  13       937   |  21      5850   |  29     20498
  6        60   |  14      1248   |  22      7026   |  30     23449
  7       100   |  15      1609   |  23      8313   |  31     26590
  8       162   |  16      2078   |  24      9860   |  32     30193




Fig. 4. Plot of $\log_2 |\mathcal{B}_3(n)| / n$ as a function of $n$, for $n = 1, 2, \ldots, 32$

B. The TGP(1) Constraint

For the degenerate case $t = 1$, things remain simple. It is easy to show that the set $\mathcal{B}_{3;1}(n)$ of all the binary sequences that satisfy the TGP(1) constraint is, in fact, the entire space $\mathcal{A}_2^n$. This is based upon the following simple observation. A ternary sequence $(x_1 x_2 \cdots x_n)$ is in $\mathcal{T}_{3;1}(n)$ if and only if the following holds: for all $k \in [n]$ such that

$$x_k = x_{k+1} = +1 \quad \text{or} \quad x_k = x_{k+1} = -1 \tag{34}$$

we have $x_{k-1} \ne 0$ if $(k-1) \in [n]$ and $x_{k+2} \ne 0$ if $(k+2) \in [n]$. On the other hand, it is easy to allocate signs to any binary sequence in such a way that (34) never holds. In what follows, we will often use $+$ and $-$ to denote $+1$ and $-1$, respectively.



Fig. 5. Simple rate 1:1 two-state encoder for the TGP(1) constraint (edge labels: 0/0, 1/+, 0/0, 1/-)

Theorem 7. For all $n \in \mathbb{Z}^+$, we have $\mathcal{B}_{3;1}(n) = \mathcal{A}_2^n$ and therefore the capacity of the TGP(1) constraint is $H_3(1) = 1$.

Proof. Given any sequence $y \in \mathcal{A}_2^n$, the following encoding rule converts $y$ to a ternary sequence $x$ satisfying the TGP(1) constraint: label the ones in $y$ with alternating signs. More precisely, if we think of $y$ as the input to the rate 1:1 encoder in Fig. 5, then $x$ is the output of the encoder. To see that $x$ indeed satisfies the TGP(1) constraint, note that the alternating signs rule guarantees that (34) never occurs. ■

Observe that, in addition to its use in the proof of Theorem 7, the encoder of Fig. 5 gives a practical method by which an arbitrary finite-length binary sequence can be transformed into a ternary sequence satisfying the TGP(1) constraint.
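The alternating-signs rule is short enough to state as code. The sketch below is illustrative: it assumes that the first one is labeled $+$ (the initial state of the encoder in Fig. 5 is not recoverable here), and it checks the TGP($t$) condition on finite sequences under the assumption that positions outside the sequence impose no condition. The function names are hypothetical.

```python
from itertools import product

def encode_tgp1(bits):
    """Label the ones of a binary sequence with alternating signs."""
    out, sign = [], +1          # assumption: the first one is labeled +
    for b in bits:
        if b == 1:
            out.append(sign)
            sign = -sign
        else:
            out.append(0)
    return out

def satisfies_tgp_t(x, t):
    """TGP(t) check on a finite sequence over {-1, 0, +1} (0-based indices)."""
    n = len(x)
    nonzero = [i for i in range(n) if x[i] != 0]
    for k in nonzero:
        for l in nonzero:
            for m in nonzero:
                close = max(abs(k - l), abs(l - m), abs(m - k)) <= t
                if close and x[k] == x[l] == x[m]:
                    j = k + l - m
                    if 0 <= j < n and x[j] == 0:
                        return False
    return True

ok = all(satisfies_tgp_t(encode_tgp1(y), 1)
         for n in range(1, 9)
         for y in product((0, 1), repeat=n))
print(ok, encode_tgp1([1, 1, 0, 1, 0, 0, 1]))
```

Because adjacent ones always receive opposite signs, no two equal nonzero symbols are ever within distance 1 of each other, which is exactly why (34) never occurs.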

C. The TGP(2) Constraint 

For $t = 2$, things become much more interesting. Our main result for this case is the characterization of the set $\mathcal{B}_{3;2} = \xi(\mathcal{T}_{3;2})$ of all finite-length binary sequences that satisfy the TGP(2) constraint in terms of a small number of forbidden blocks.

To make this precise, let us first clarify our use of the term sub-block. We say that a sequence $(x'_1 x'_2 \cdots x'_m)$ is a sub-block of the sequence $(x_1 x_2 \cdots x_n)$ if there exists an $i \in [0, n-m]$ such that $(x'_1 x'_2 \cdots x'_m) = (x_{i+1} x_{i+2} \cdots x_{i+m})$. Now, let

$$\mathcal{F}(2) \;\stackrel{\mathrm{def}}{=}\; \{(011100),\ (001110),\ (001111100)\} \tag{35}$$

and let $\mathcal{S}_{\mathcal{F}(2)}(n)$ be the set of all binary sequences of length $n$ that do not contain any element of $\mathcal{F}(2)$ as a sub-block. Our main result in this subsection is the following theorem.

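Theorem 8. For all $n \in \mathbb{Z}^+$, we have

$$\mathcal{B}_{3;2}(n) = \mathcal{S}_{\mathcal{F}(2)}(n). \tag{36}$$

We split the proof of (36) into two lemmas: one shows that $\mathcal{S}_{\mathcal{F}(2)}(n) \subseteq \mathcal{B}_{3;2}(n)$, the other establishes $\mathcal{B}_{3;2}(n) \subseteq \mathcal{S}_{\mathcal{F}(2)}(n)$. One of the two directions is easy, as the next lemma shows.

Lemma 9. For all $n \in \mathbb{Z}^+$, we have

$$\mathcal{B}_{3;2}(n) \subseteq \mathcal{S}_{\mathcal{F}(2)}(n). \tag{37}$$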

Proof. We need to show that none of the sequences in $\mathcal{B}_{3;2}$ contains any of the three sequences in $\mathcal{F}(2)$ as a sub-block. Consider first the sequence $(011100) \in \mathcal{F}(2)$. The three ones in (011100) can be labeled in $2^3$ different ways by $+/-$ to produce ternary sequences. However, noting that a ternary sequence $x$ satisfies the TGP(2) constraint if and only if so does the sequence $-x$, it is enough to consider the following four labelings of (011100):

(0+++00), (0++-00), (0+-+00), (0+--00). (38)

It can be verified by direct inspection that none of the four sequences in (38) satisfies the TGP(2) constraint. Hence, none can be a sub-block of a sequence in $\mathcal{T}_{3;2}$, which implies that (011100) cannot be a sub-block of a sequence in $\mathcal{B}_{3;2}$. The other two forbidden blocks in $\mathcal{F}(2)$ can be disposed of in the same way. ■

To establish inclusion in the opposite direction, we describe an encoding rule that takes an arbitrary sequence $y \in \mathcal{S}_{\mathcal{F}(2)}(n)$ and assigns a $+/-$ labeling to the ones in $y$ in such a way that the resulting ternary sequence satisfies the TGP(2) constraint. More precisely, we construct a function

$$\Psi : \bigcup_{n=1}^{\infty} \mathcal{A}_2^n \to \bigcup_{n=1}^{\infty} \mathcal{A}_3^n \tag{39}$$

such that $\xi(\Psi(y)) = y$ for all $y$ in the domain of $\Psi$ and, furthermore, $\Psi(y) \in \mathcal{T}_{3;2}(n)$ for all $y \in \mathcal{S}_{\mathcal{F}(2)}(n)$. This function $\Psi$ will be based upon the alternating signs idea of Theorem 7; however, a much more careful analysis is now required.

The first step in the construction of $\Psi$ consists of decomposing a binary sequence $y$ into its maximal runs. Henceforth, we use $0^j$ and $1^j$ to denote the all-zero and the all-one sequences of length $j$, respectively. Any finite-length nonzero binary sequence $y$ can be written uniquely in its maximal-run form:

$$y = \bigl(0^{a_0}\, 1^{b_1}\, 0^{a_1}\, 1^{b_2}\, 0^{a_2} \cdots 0^{a_{r-1}}\, 1^{b_r}\, 0^{a_r}\bigr) \tag{40}$$

for some $r \ge 1$, where $a_1, a_2, \ldots, a_{r-1}$ and $b_1, b_2, \ldots, b_r$ are positive integers while $a_0, a_r \ge 0$. Each of the $r$ sub-blocks $1^{b_i}$ of $y$ is called a maximal run of ones in $y$.

The next step is to convert maximal runs into sequences over the alphabet $\{+, -\}$. Specifically, we define the function $\psi : \bigcup_{j=1}^{\infty} \{1^j\} \to \bigcup_{j=1}^{\infty} \{+, -\}^j$ as follows:

tKi 1 ) = + , <Ki 2 ) = +-, <Ki 3 ) = 
^(i 4 ) = +— +, ip(i 5 ) = 



-+ 



-++- 



-++- 



(41) 



where the last expression above applies for all $j \ge 7$. Observe that $\psi(1^j)$ is a sequence of length $j$, so that $\xi(\psi(1^j)) = 1^j$. More importantly, $\psi(1^j)$ satisfies the property described in the following lemma.

Lemma 10. Let $j$ be a positive integer other than 3 or 5, and let $\psi(1^j) = (x_1 x_2 \cdots x_j)$. Then for all $k, l, m \in [j]$ such that

$$x_k = x_l = x_m \quad \text{and} \quad \max\{\,|k-l|,\ |l-m|,\ |m-k|\,\} \le 2 \tag{42}$$

we have $k + l - m \in [j]$ as well. Thus $x_{k+l-m} \ne 0$ and, moreover, if $(x_1 x_2 \cdots x_j)$ is a sub-block of a TGP(2)-constrained ternary sequence $x$ of length $n$, then this sub-block does not impose constraints on any of the other $n - j$ positions in $x$.

Proof. The fact that $k + l - m \in [j]$ whenever (42) is satisfied follows by direct inspection from (41). ■

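Now let $y \in \mathcal{A}_2^n$ be an arbitrary binary sequence of length $n$. If $y = 0^n$ or $y = 1^n$, we simply set $\Psi(y) = y$. Otherwise, we decompose $y$ into its maximal runs as in (40), and set

$$\Psi(y) = \bigl(0^{a_0}\, \mathbf{x}_1\, 0^{a_1}\, \mathbf{x}_2\, 0^{a_2} \cdots 0^{a_{r-1}}\, \mathbf{x}_r\, 0^{a_r}\bigr) \tag{43}$$

where $\mathbf{x}_i \in \{+, -\}^{b_i}$ are defined by the following iterative procedure:

$$\mathbf{x}_1 = \begin{cases} {+}{+}{-} & \text{if } a_0 = 0 \text{ and } b_1 = 3 \\ \psi(1^{b_1}) & \text{otherwise} \end{cases} \tag{44}$$

$$\mathbf{x}_i = \begin{cases} \psi(1^{b_i}) & \text{if last symbol of } \mathbf{x}_{i-1} \text{ is } - \\ -\psi(1^{b_i}) & \text{if last symbol of } \mathbf{x}_{i-1} \text{ is } + \end{cases} \tag{45}$$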



for all i = 2, 3, . . . , r, but with two exceptions. If b, = 5 while 
fl; £ {0, 1}, we modify the expression for Xj as follows: 



Xj = 



1 — if last symbol of cc;_i is — 

~H h if last symbol of Xj^\ is + . 



(46) 



Finally, if $y$ ends with 0111 (that is, if $a_r = 0$ and $b_r = 3$), then we also modify the expression for $\mathbf{x}_r$ as follows:

$$\mathbf{x}_r = \begin{cases} {+}{-}{-} & \text{if last symbol of } \mathbf{x}_{r-1} \text{ is } - \\ {-}{+}{+} & \text{if last symbol of } \mathbf{x}_{r-1} \text{ is } + . \end{cases} \tag{47}$$

Observe that (44)-(47) iteratively determine $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_r$ in such a way that the first symbol of $\mathbf{x}_i$ is always opposite in sign to the last symbol of $\mathbf{x}_{i-1}$, for all $i = 2, 3, \ldots, r$. This is the appropriate generalization of the alternating signs rule of Theorem 7 for the case of the TGP(2) constraint.

Lemma 11. The function $\Psi$ defined by equations (40)-(47) has the following properties:

P1. For all $y \in \mathcal{A}_2^n$, we have $\xi(\Psi(y)) = y$.

P2. For all $y \in \mathcal{S}_{\mathcal{F}(2)}(n)$, we have $\Psi(y) \in \mathcal{T}_{3;2}(n)$.

Proof. Property P1 means that $\Psi$ converts a given binary sequence $y$ to a ternary sequence solely by assigning $+/-$ labels to the ones in $y$. This should be obvious from the fact that $\xi(\psi(1^j)) = 1^j$ and our construction of $\Psi$ in (43)-(47).

To establish property P2, consider an arbitrary $y \in \mathcal{S}_{\mathcal{F}(2)}(n)$ and let $x = (x_1 x_2 \cdots x_n)$ denote its image $\Psi(y)$ under $\Psi$. We need to show that $x$ satisfies the TGP(2) constraint. Clearly, if $y \in \{0^n, 1^n\}$, then $x = y$ trivially satisfies the constraint. We therefore assume that $y \notin \{0^n, 1^n\}$, which implies that $x$ is given by (43). Now, let $k, l, m \in \mathrm{supp}(x)$ and suppose that

$$x_k = x_l = x_m \quad \text{and} \quad \max\{\,|k-l|,\ |l-m|,\ |m-k|\,\} \le 2.$$

We will further assume w.l.o.g. that $k \le l \le m$. Clearly, either $x_k$ and $x_m$ come from the same sub-block $\mathbf{x}_i$ of $x$ in (43), or they belong to distinct sub-blocks $\mathbf{x}_i$ and $\mathbf{x}_j$. This leads to two cases, which we consider next.

Case 1: $x_k$ belongs to $\mathbf{x}_i$ while $x_m$ belongs to $\mathbf{x}_j$, with $i \ne j$. Since distinct sub-blocks in (43) are separated by at least one zero, the only way that $|m - k| \le 2$ can be satisfied is if $x_k$ is the last symbol of $\mathbf{x}_i$ whereas $x_m$ is the first symbol of $\mathbf{x}_{i+1}$. But then the alternating signs rule implemented in (44)-(47) guarantees that $x_k \ne x_m$. We have thus arrived at a contradiction. This implies that $x_k$ and $x_m$ (and, hence, also $x_l$) must belong to the same sub-block $\mathbf{x}_i$ of $x$ in (43).

Case 2: $x_k, x_l, x_m$ belong to the sub-block $\mathbf{x}_i$ of length $j$. First suppose that $j \notin \{3, 5\}$. Then (44)-(47) guarantee that $\mathbf{x}_i = \psi(1^j)$ or $\mathbf{x}_i = -\psi(1^j)$. For this case, Lemma 10 implies that $x_{k+l-m}$, $x_{k+m-l}$, and $x_{l+m-k}$ also lie within $\mathbf{x}_i$. This, in turn, guarantees that they are all nonzero, which is in agreement with the TGP(2) constraint. We are thus left to deal with the situation where $j = 3$ or $j = 5$. This is precisely where the forbidden blocks in $\mathcal{F}(2)$ come into play.
Case 2.1: The sub-block $\mathbf{x}_i$ is of length $j = 5$.
The key point is that the binary sequence (001111100) never occurs as a sub-block of $y$. Hence $\mathbf{x}_i$ never appears in the context $\cdots 00\, \mathbf{x}_i\, 00 \cdots$. Note that the only relevant context for the TGP(2) constraint consists of the two






Revised version, submitted to the IEEE TRANSACTIONS ON INFORMATION THEORY, February 1, 2008 



symbols immediately before $\mathbf{x}_i$, and the two symbols immediately after $\mathbf{x}_i$. The fact that (001111100) does not occur in $y$ together with the encoding rules in (41)-(47) guarantee that $\mathbf{x}_i$ appears as follows in all of its possible contexts:

    (+-++-00···          (+-++-0+···
    (0+-++-00···         (0+-++-0+···
    ···+0-+--+00···      ···+0-+--+0-···
    ···-0+-++-00···      ···-0+-++-0+···
    ···00+--+-0+···      ···00-++-+0-···
    ···00+--+-)          ···00-++-+)
    ···-0+--+-)          ···+0-++-+)
    ···00+--+-0)         ···00-++-+0)
    ···-0+--+-0)         ···+0-++-+0)          (48)

where '(' and ')' signify the beginning and the end of the entire sequence $x = \Psi(y)$, respectively. It is now easy to verify by direct inspection that each of the 18 sequences in (48) satisfies the TGP(2) constraint.

Case 2.2: The sub-block $\mathbf{x}_i$ is of length $j = 3$.
Similarly to the previous case, the fact that (011100) and (001110) do not occur in $y$ together with the encoding rules in (41)-(47) guarantee that $\mathbf{x}_i$ appears as follows in all of its possible contexts:

    (++-00···        (++-0+···        (0+-+0-···
    ···-0+-+0-···    ···+0-+-0+···
    ···00+--)        ···+0-++)        ···-0+-+0)
    ···00-++)        ···-0+--)        ···+0-+-0).          (49)

Again, it can be verified by direct inspection that each of the 11 sequences in (49) satisfies the TGP(2) constraint.

Since our analysis in Cases 1 and 2 is exhaustive, this establishes property P2 and completes the proof of the lemma. ■

Lemma 11 shows that every sequence $y \in \mathcal{S}_{\mathcal{F}(2)}(n)$ can be converted to a ternary sequence in $\mathcal{T}_{3;2}(n)$ by assigning $+/-$ labels to the ones in $y$. This implies that $\mathcal{S}_{\mathcal{F}(2)}(n) \subseteq \mathcal{B}_{3;2}(n)$, by the definition of $\mathcal{B}_{3;2}(n)$ in (19). Together with Lemma 9, this completes the proof of Theorem 8. The next corollary uses this result to determine the capacity of the TGP(2) constraint.
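Theorem 8 can be checked by exhaustive search for short lengths, including $n = 9$, where the longest forbidden block first appears. The sketch below is illustrative and does not use the encoder $\Psi$: it simply tries every sign labeling. It assumes that, for a finite sequence, a triple $k, l, m$ with $k + l - m$ outside the sequence imposes no condition. The function names are hypothetical.

```python
from itertools import product

FORBIDDEN = ("011100", "001110", "001111100")   # the set F(2) of (35)

def satisfies_tgp2(x, t=2):
    """TGP(t) check on a finite tuple over {-1, 0, +1} (0-based indices)."""
    n = len(x)
    for k in range(n):
        if x[k] == 0:
            continue
        for l in range(k, min(n, k + t + 1)):        # w.l.o.g. k <= l
            if x[l] != x[k]:
                continue
            # m must lie within distance t of both k and l
            for m in range(max(0, l - t), min(n, k + t + 1)):
                if x[m] == x[k]:
                    j = k + l - m
                    if 0 <= j < n and x[j] == 0:
                        return False
    return True

def b32(n):
    """Binary patterns of length n that admit a valid sign labeling."""
    return {"".join(str(abs(v)) for v in x)
            for x in product((-1, 0, 1), repeat=n)
            if satisfies_tgp2(x)}

def s_f2(n):
    """Binary strings of length n avoiding every block in F(2)."""
    words = ("".join(w) for w in product("01", repeat=n))
    return {w for w in words if not any(f in w for f in FORBIDDEN)}

results = {n: b32(n) == s_f2(n) for n in range(1, 10)}
print(results)
```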

Corollary 12. Let $\rho$ denote the largest-magnitude root of the polynomial $z^{10} - 2z^9 + z^5 - z^4 + 2z^3 - z^2 - 2z + 1$. Then the capacity of the TGP(2) constraint is given by

$$H_3(2) = \log_2 \rho \approx 0.96048. \tag{50}$$



Proof. We will use the results of Wilf [25] and of Guibas and Odlyzko [11], which provide a much more efficient means to compute the capacity of a constraint from its set of forbidden blocks than the standard methods (briefly discussed at the end of this subsection). Let $g_0 = 1$, and for $n \in \mathbb{Z}^+$, define

$$g_n \;\stackrel{\mathrm{def}}{=}\; |\mathcal{B}_{3;2}(n)| = |\mathcal{S}_{\mathcal{F}(2)}(n)|.$$

Further, define the generating function $G(z) = \sum_{n=0}^{\infty} g_n z^{-n}$. Using Theorem 1 of [11], we find that $G(z)$ is given by

$$G(z) = \frac{z(z+1)(z^8 - z^7 + z^6 - z^5 + z^4 - z^2 + 2z - 1)}{z^{10} - 2z^9 + z^5 - z^4 + 2z^3 - z^2 - 2z + 1}.$$

It can be easily verified (using, say, MATLAB or MATHEMATICA) that the largest-magnitude pole of $G(z)$ is the unique largest-magnitude root of its denominator polynomial. Moreover, this root $\rho$ is real and simple. It now follows from the theory of generating functions due to Wilf [25, Chapter 5] that $g_n = \alpha \rho^n (1 + o(1))$ for some constant $\alpha > 0$. Consequently,

$$H_3(2) = \lim_{n\to\infty} \frac{\log_2 |\mathcal{B}_{3;2}(n)|}{n} = \lim_{n\to\infty} \frac{\log_2 g_n}{n} = \log_2 \rho.$$

Using the MATHEMATICA software package, we have found that $\rho \approx 1.94596$, and therefore $H_3(2) \approx 0.96048$. ■
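The numerical value in (50) can be reproduced without a computer algebra system. The sketch below is illustrative. It locates the root of the denominator polynomial by bisection on $[1.9, 2]$, where the polynomial changes sign, and separately counts the sequences avoiding $\mathcal{F}(2)$ with a small dynamic program over the last eight symbols, so that the growth rate $\log_2 g_n / n$ can be compared with $\log_2 \rho$. The bracketing interval and the function names are assumptions of this sketch.

```python
import math

# Coefficients of z^10 - 2z^9 + z^5 - z^4 + 2z^3 - z^2 - 2z + 1, highest degree first.
COEFFS = [1, -2, 0, 0, 0, 1, -1, 2, -1, -2, 1]
FORBIDDEN = ("011100", "001110", "001111100")

def poly(z):
    """Evaluate the polynomial by Horner's rule."""
    value = 0.0
    for c in COEFFS:
        value = value * z + c
    return value

def largest_root(lo=1.9, hi=2.0, iters=200):
    """Bisection; assumes poly(lo) < 0 < poly(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if poly(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def count_avoiding(n_max):
    """Return [g_0, ..., g_n_max]; the state is the last (up to) 8 symbols."""
    counts = {"": 1}
    g = [1]
    for _ in range(n_max):
        nxt = {}
        for suffix, c in counts.items():
            for bit in "01":
                w = suffix + bit
                if any(w.endswith(f) for f in FORBIDDEN):
                    continue
                key = w[-8:]
                nxt[key] = nxt.get(key, 0) + c
        counts = nxt
        g.append(sum(counts.values()))
    return g

rho = largest_root()
g = count_avoiding(60)
print(round(rho, 5), round(math.log2(rho), 5))
print(g[:10], round(math.log2(g[60]) / 60, 4))
```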

Observe that $H_3(t)$ is much larger than $H_2(t)$ for $t = 1, 2$, as can be seen by comparing Table I with Theorem 7 and Corollary 12. Furthermore, the drop from $H_3(1)$ to $H_3(2)$ is significantly smaller than the drop from $H_2(1)$ to $H_2(2)$. As mentioned in Section II-D, if this trend continues for larger values of $t$, we can have reasonably efficient codes that, under the simplifying assumption of that section, mitigate the formation of ghost pulses in a typical optical communication scenario.

To conclude our discussion of the TGP(2) constraint, we comment upon the design of encoders for converting arbitrary binary sequences into TGP(2)-constrained ternary sequences. The function $\Psi$ constructed in (41)-(47) provides an explicit method of transforming sequences in $\mathcal{S}_{\mathcal{F}(2)}(n) = \mathcal{B}_{3;2}(n)$ into sequences in $\mathcal{T}_{3;2}(n)$. However, this function does not work for arbitrary binary sequences: if $y \notin \mathcal{S}_{\mathcal{F}(2)}(n)$, then $\Psi(y)$ is not necessarily in $\mathcal{T}_{3;2}(n)$. Thus, we still need to design an encoder that converts an arbitrary (unconstrained) binary sequence to a sequence in the constrained system

$$\mathcal{S}_{\mathcal{F}(2)} \;\stackrel{\mathrm{def}}{=}\; \bigcup_{n=1}^{\infty} \mathcal{S}_{\mathcal{F}(2)}(n).$$

The theory of constrained coding provides a standard way to design such encoders, which we briefly outline in what follows. Let $\mathcal{G}$ be a finite, labeled, directed graph. We say that $\mathcal{G}$ is a presentation of a constrained system $\mathcal{S}$ if $\mathcal{S}$ is the set of all sequences obtained by reading the labels of all finite paths in $\mathcal{G}$. A presentation $\mathcal{G}$ of $\mathcal{S}$ is deterministic if at each vertex of $\mathcal{G}$, the outgoing edges are labeled distinctly. Given a deterministic presentation of $\mathcal{S}$ along with integers $p$ and $q$ such that $p/q$ is less than or equal to the capacity of $\mathcal{S}$, there is a systematic algorithm [21, Section 4] for designing a rate $p:q$ finite-state encoder for $\mathcal{S}$ along with a corresponding decoder. Thus, to construct a finite-state encoder for our constrained system $\mathcal{S}_{\mathcal{F}(2)} = \mathcal{B}_{3;2}$, all we need to do is provide a deterministic presentation for $\mathcal{S}_{\mathcal{F}(2)}$. From this, the desired encoder can be generated via the algorithm mentioned above.

It may be verified that the graph in Fig. 6 is a deterministic presentation of $\mathcal{S}_{\mathcal{F}(2)}$. Hence, it can be used as the starting point for the design of encoders that convert unconstrained binary sequences to sequences in $\mathcal{B}_{3;2}$ (and then, via the function $\Psi$ in (41)-(47), to sequences in $\mathcal{T}_{3;2}$). In fact, the graph in Fig. 6 is the minimal deterministic presentation (also known as the Shannon cover) of $\mathcal{B}_{3;2}$, in the sense that it has the least number of vertices among all deterministic presentations of $\mathcal{B}_{3;2}$.

Fig. 6. A deterministic presentation of the constrained system $\mathcal{S}_{\mathcal{F}(2)} = \mathcal{B}_{3;2}$

While on the subject of deterministic presentations, let us state the following well-known fact [21, Theorem 3.12], which will be needed in the next subsection. If $\mathcal{G}$ is a deterministic presentation of a given constrained system $\mathcal{S}$, then the capacity of $\mathcal{S}$ is $\log_2 \lambda(A_{\mathcal{G}})$, where $\lambda(A_{\mathcal{G}})$ is the largest eigenvalue of the adjacency matrix $A_{\mathcal{G}}$ of $\mathcal{G}$. Incidentally, this provides an alternative proof of Corollary 12, since the characteristic polynomial of the adjacency matrix of the graph in Fig. 6 is precisely $z^{10} - 2z^9 + z^5 - z^4 + 2z^3 - z^2 - 2z + 1$.

D. The TGP(t) Constraints for t ≥ 3

It is clear that the painstaking analysis presented in the previous subsection cannot be easily extended to the TGP($t$) constraint for an arbitrary $t \in \mathbb{Z}^+$. Instead, we suggest an alternative, systematic approach to tackle the general case, which can, in principle, be programmed into a computer.

The approach developed in this section has two main disadvantages. First, instead of computing $H_3(t)$ we end up with a slightly different quantity

$$H'_3(t) \;\stackrel{\mathrm{def}}{=}\; \lim_{n\to\infty} \frac{\log_2 |\mathcal{B}'_{3;t}(n)|}{n} \tag{51}$$

where $\mathcal{B}'_{3;t}(n)$ is the set of all binary sequences of length $n$ that can be extended to a bi-infinite sequence without violating the TGP($t$) constraint (more precise definition to follow shortly). This is not much of a problem, since $H'_3(t) \le H_3(t)$ for all $t$ and there are good reasons to believe that $H'_3(t) = H_3(t)$ for all $t$ (see the remark below). The second problem is the computational complexity of the proposed approach. Unfortunately, this complexity is doubly-exponential in $t$. In fact, in order to compute $H'_3(t)$ one needs to construct a graph with at least $2^{\Omega(9^t)}$ vertices. Thus the proposed approach is not practical even for $t = 2$. Nevertheless, we believe that this approach has conceptual value, and sheds additional light on the underlying structure of the TGP($t$) constraint.

The general idea behind our approach is to develop a procedure that, given a $t \in \mathbb{Z}^+$, generates a deterministic presentation $\mathcal{H}_{3;t}$ of the constrained system

$$\mathcal{B}'_{3;t} = \bigcup_{n=1}^{\infty} \mathcal{B}'_{3;t}(n). \tag{52}$$

In developing our results, it would be much more convenient to deal with bi-infinite sequences. This eliminates the "edge effects" present at the beginning and end of a finite sequence, which could be quite bothersome (for example, much of the effort in describing the encoding rule $\Psi$ of the previous subsection, see (44), (46), (47), was devoted to such edge effects).



Recall that $\mathcal{T}_3^*$ was defined in Section IV-A as the set of bi-infinite ternary sequences satisfying the TGP constraint. We extend this definition in the natural way to the TGP($t$) constraint.

Definition 6. A bi-infinite sequence $x = \{x_j\}_{j \in \mathbb{Z}}$ over the ternary alphabet $\mathcal{A}_3 = \{-1, 0, 1\}$ is said to satisfy the TGP($t$) constraint if for all $k, l, m \in \mathbb{Z}$ such that

$$\max\{\,|k-l|,\ |l-m|,\ |m-k|\,\} \le t,$$

whenever $x_k, x_l, x_m$ are equal and nonzero, then $x_{k+l-m}$ is also nonzero. We let $\mathcal{T}^*_{3;t}$ denote the set of all bi-infinite ternary sequences satisfying the TGP($t$) constraint, and let $\mathcal{B}^*_{3;t} = \xi(\mathcal{T}^*_{3;t})$ denote the set of all bi-infinite binary sequences that can be converted to a sequence in $\mathcal{T}^*_{3;t}$ by negating some of their ones.

We now construct a deterministic presentation for $\mathcal{T}^*_{3;t}$. Given a $t \in \mathbb{Z}^+$, define a finite, labeled, directed graph $\mathcal{G}_{3;t}$ as follows. The set of vertices of $\mathcal{G}_{3;t}$ is the set of all

$$\mathbf{x} = (x_{-t}\, x_{-t+1} \cdots x_{-1}\, x_0\, x_1 \cdots x_{2t-1}\, x_{2t}) \in \mathcal{A}_3^{3t+1}$$

that satisfy the following condition: for all $k, l, m \in [0, t]$ such that $x_k, x_l, x_m$ are equal and nonzero, we also have $x_{k+l-m} \ne 0$. Note that the position indices $k, l, m$ are restricted to the interval $[0, t]$ in the above condition. This implies that $\mathcal{G}_{3;t}$ has at least $3^{2t}$ vertices; for example, all the sequences of the form

$$(x_{-t}\, x_{-t+1} \cdots x_{-1}\ 0\, 0 \cdots 0\ x_{t+1}\, x_{t+2} \cdots x_{2t})$$

are vertices of $\mathcal{G}_{3;t}$. In fact, the order (number of vertices) of $\mathcal{G}_{3;t}$ is probably closer to $3^{3t}$ than to $3^{2t}$ (however, when $t$ is small, the vertices of $\mathcal{G}_{3;t}$ can still be enumerated by exhaustive computer search). The edges of $\mathcal{G}_{3;t}$ are defined as follows. For each pair of vertices

$$\mathbf{x} = (x_{-t}\, x_{-t+1} \cdots x_{2t}) \quad \text{and} \quad \mathbf{x}' = (x'_{-t}\, x'_{-t+1} \cdots x'_{2t})$$

where $\mathbf{x}$ and $\mathbf{x}'$ are not necessarily distinct, we draw a single directed edge from $\mathbf{x}$ to $\mathbf{x}'$ if and only if the last $3t$ symbols of $\mathbf{x}$ are equal to the first $3t$ symbols of $\mathbf{x}'$, that is if

$$(x_{-t+1}\, x_{-t+2} \cdots x_{2t}) = (x'_{-t}\, x'_{-t+1} \cdots x'_{2t-1}).$$

The label of this directed edge is the symbol $x'_{2t}$, the last symbol of the terminal vertex. This completes our construction of the graph $\mathcal{G}_{3;t}$.
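For $t = 1$ the graph is small enough to build directly. The sketch below is illustrative. It represents a vertex as a tuple of length $3t + 1$ whose entry at index $i + t$ holds $x_i$, and it labels each edge by the last symbol of its terminal vertex. The dictionary-based representation and the function names are assumptions of this sketch.

```python
from itertools import product

ALPHABET = (-1, 0, 1)

def is_vertex(x, t):
    """Vertex condition: k, l, m range over [0, t]; x[i + t] holds x_i."""
    for k in range(t + 1):
        for l in range(t + 1):
            for m in range(t + 1):
                a = x[k + t]
                if a != 0 and a == x[l + t] == x[m + t]:
                    # k + l - m always lies in [-t, 2t], so the index is valid
                    if x[k + l - m + t] == 0:
                        return False
    return True

def build_graph(t):
    """Return (vertices, edges); edges[x][label] is the successor of x."""
    verts = {x for x in product(ALPHABET, repeat=3 * t + 1) if is_vertex(x, t)}
    edges = {}
    for x in verts:
        for s in ALPHABET:
            y = x[1:] + (s,)          # last 3t symbols of x, then the new symbol s
            if y in verts:
                edges.setdefault(x, {})[s] = y
    return verts, edges

verts, edges = build_graph(1)
print(len(verts), sum(len(out) for out in edges.values()))
```

Since the successor of a vertex is determined by the appended symbol, the outgoing edges of every vertex carry distinct labels. For $t = 1$ the vertex count falls between the bounds $3^{2t} = 9$ and $3^{3t+1} = 81$.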

Given a finite, labeled, directed graph $\mathcal{G}$, the sofic shift of $\mathcal{G}$ is the set of all bi-infinite sequences obtained by reading the labels of bi-infinite paths in $\mathcal{G}$. One of our main results in this subsection is the following theorem.

Theorem 13. Let $\mathcal{X}^*_{3;t}$ denote the sofic shift of the graph $\mathcal{G}_{3;t}$. Then, for all $t \in \mathbb{Z}^+$, we have

$$\mathcal{X}^*_{3;t} = \mathcal{T}^*_{3;t}. \tag{53}$$

Proof. We first show that $\mathcal{T}^*_{3;t} \subseteq \mathcal{X}^*_{3;t}$. Consider any element $x = \{x_j\}_{j \in \mathbb{Z}}$ of $\mathcal{T}^*_{3;t}$. For all $j \in \mathbb{Z}$, let $\mathbf{x}_j$ denote the sub-block $(x_{j-t}\, x_{j-t+1} \cdots x_{j+2t})$ of $x$. Since $x$ satisfies the TGP($t$) constraint, it follows from our construction of $\mathcal{G}_{3;t}$ that $\mathbf{x}_j$ is a vertex of $\mathcal{G}_{3;t}$ for all $j \in \mathbb{Z}$. Moreover, since the last $3t$ symbols of $\mathbf{x}_{j-1}$ are obviously equal to the first $3t$ symbols of $\mathbf{x}_j$, the graph $\mathcal{G}_{3;t}$ has a unique edge $e_j$ from $\mathbf{x}_{j-1}$ to $\mathbf{x}_j$, which is labeled by $x_{j+2t}$. But then, the sequence of such edges $\{e_j\}_{j \in \mathbb{Z}}$ is a path in $\mathcal{G}_{3;t}$ that generates $x$. It follows that $x \in \mathcal{X}^*_{3;t}$.

In order to establish the inclusion $\mathcal{X}^*_{3;t} \subseteq \mathcal{T}^*_{3;t}$, consider any element $x = \{x_j\}_{j \in \mathbb{Z}}$ of $\mathcal{X}^*_{3;t}$ and let $\{e_j\}_{j \in \mathbb{Z}}$ denote the path in $\mathcal{G}_{3;t}$ that generates $x$. We again let $\mathbf{x}_j$ denote the sub-block $(x_{j-t}\, x_{j-t+1} \cdots x_{j+2t})$ of $x$. Then, it follows from our construction of $\mathcal{G}_{3;t}$ that for all $j \in \mathbb{Z}$, the vertex at which $e_{j+2t}$ terminates must be the sequence $\mathbf{x}_j$. Therefore, $\mathbf{x}_j$ is a vertex in $\mathcal{G}_{3;t}$ for all $j \in \mathbb{Z}$. Now, suppose we have $k, l, m \in \mathrm{supp}(x)$ such that $\max\{|k-l|, |l-m|, |m-k|\} \le t$ and $x_k = x_l = x_m$. In order to prove that $x \in \mathcal{T}^*_{3;t}$, we must show that $x_{k+l-m} \ne 0$. Since $k, l, m$ are all within a distance of $t$ of each other, there exists a $j \in \mathbb{Z}$ such that $k, l, m \in [j, j+t]$. Observe that for any $k, l, m \in [j, j+t]$, the integer $k + l - m$ lies in $[j-t, j+2t]$. But now, since $\mathbf{x}_j$ is a vertex of $\mathcal{G}_{3;t}$, it follows from our definition of the vertex set of $\mathcal{G}_{3;t}$ that $x_{k+l-m} \ne 0$. Thus $x \in \mathcal{T}^*_{3;t}$, which shows that $\mathcal{X}^*_{3;t} \subseteq \mathcal{T}^*_{3;t}$ and completes the proof. ■

We now define $\mathcal{T}'_{3;t}$ as the set of all finite-length sequences that are sub-blocks of some sequence in $\mathcal{T}^*_{3;t}$. Stated another way, $\mathcal{T}'_{3;t}$ is a subset of the set $\mathcal{T}_{3;t}$ defined in Section II, consisting of all finite-length sequences that a) satisfy the TGP($t$) constraint and b) can be extended to a bi-infinite sequence that satisfies the TGP($t$) constraint. It is possible that some finite-length sequences in $\mathcal{T}_{3;t}$ cannot be extended in this way, in which case $\mathcal{T}'_{3;t}$ is strictly smaller than $\mathcal{T}_{3;t}$.

Corollary 14. Let $\mathcal{X}_{3;t}$ denote the constrained system of the graph $\mathcal{G}_{3;t}$. Then, for all $t \in \mathbb{Z}^+$, we have

$$\mathcal{X}_{3;t} = \mathcal{T}'_{3;t}. \tag{54}$$

Moreover, the graph $\mathcal{G}_{3;t}$ is a deterministic presentation of its constrained system $\mathcal{X}_{3;t} = \mathcal{T}'_{3;t}$.

Proof. It should be obvious from our construction of $\mathcal{G}_{3;t}$ that outgoing edges at each vertex of $\mathcal{G}_{3;t}$ are labeled distinctly. Hence $\mathcal{G}_{3;t}$ is a deterministic presentation of its constrained system. Furthermore, it is well known (and obvious) that (53) implies (54). In the terminology of symbolic dynamics, the sets $\mathcal{X}_{3;t}$ and $\mathcal{T}'_{3;t}$ are precisely the languages of the sofic shifts $\mathcal{X}^*_{3;t}$ and $\mathcal{T}^*_{3;t}$. Since the shifts are equal (by Theorem 13), their languages must also be equal. ■

Corollary 14 implies that we can find the capacity of $\mathcal{T}'_{3;t}$
from the largest eigenvalue of the adjacency matrix of $\mathcal{G}_{3;t}$.
However, we are not interested in $\mathcal{T}'_{3;t}$, but rather in the set

$$\mathcal{B}'_{3;t} = \xi(\mathcal{T}'_{3;t}) = \{\xi(x) : x \in \mathcal{T}'_{3;t}\}. \tag{55}$$

Letting $B'_{3;t}(n)$ denote the number of sequences of length $n$
in $\mathcal{B}'_{3;t}$, we get the expression (51) for the capacity $H'_3(t)$.

Remark. Here is a heuristic argument in support of our claim
that $H'_3(t)$ is likely to be equal to $H_3(t)$. The difference
between $H'_3(t)$ and $H_3(t)$ stems from the difference between
the sets $\mathcal{T}'_{3;t}$ and $\mathcal{T}_{3;t}$. It is well known [13], [18], [21] that the
capacity of a language is equal to the entropy of the underlying
shift. Thus, instead of looking at $\mathcal{T}'_{3;t}$, we might as well look at
the underlying sofic shift $X^*_{3;t} = \mathcal{T}^*_{3;t}$. The TGP($t$) constraint
defining $\mathcal{T}_{3;t}$ is a finite restriction of the TGP($t$) constraint
defining $\mathcal{T}^*_{3;t}$. Furthermore, the TGP($t$) constraint is local, in
the sense that it is defined through a finite window of length $t$.

Now, it is generally observed in the literature [18] that if a
constrained system $S$ is obtained via a finite restriction of a
local constraint that defines a sofic shift $X$, then the capacity
of $S$ equals the entropy of $X$. Of course, this is clearly true
whenever any finite sequence in $S$ can be extended to a
bi-infinite sequence in $X$. However, "edge effects" sometimes
make it impossible to extend certain sequences in $S$ without
violating the constraint. But, in the case of a local constraint,
these edge effects are usually not strong enough to affect a
significant proportion of the sequences in $S$, so that the capacity
of $S$ is still equal to the entropy of $X$. This is not always true,
but the exceptions to this rule tend to be pathological.

It may be possible to prove rigorously that $H'_3(t) = H_3(t)$,
but such a proof would have to deal in detail with the "edge
effects" and is likely to be too tedious to be worth the effort.

The remaining problem is to construct a deterministic
presentation for the set $\mathcal{B}'_{3;t}$ in (52) and (55). Given the graph
$\mathcal{G}_{3;t}$, constructing a presentation for $\mathcal{B}'_{3;t}$ is easy: simply apply
$\xi(\cdot)$ to all the labels in $\mathcal{G}_{3;t}$. Specifically, let $\mathcal{G}'_{3;t}$ denote the
graph obtained from $\mathcal{G}_{3;t}$ by replacing the labels of all the
edges with their absolute values. Then it is obvious from (55)
and Corollary 14 that the graph $\mathcal{G}'_{3;t}$ is a presentation for $\mathcal{B}'_{3;t}$.

Note, however, that although $\mathcal{G}_{3;t}$ is a deterministic
presentation of $\mathcal{T}'_{3;t}$, the graph $\mathcal{G}'_{3;t}$ is not necessarily a deterministic
presentation of $\mathcal{B}'_{3;t}$. Indeed, there may be two edges
emanating from the same vertex $\mathbf{x}$ in $\mathcal{G}_{3;t}$, one labeled with $+$
and the other with $-$, whose labels in $\mathcal{G}'_{3;t}$ would both be
1. Fortunately, there is a well-known procedure that, given
an arbitrary presentation of a constrained system, constructs
a deterministic presentation for it. This procedure is called
the subset construction method; it is described in detail in
[21, Section 2.2.1] and in [18, Theorem 3.3.2]. Applying the
subset construction method to the graph $\mathcal{G}'_{3;t}$, we finally obtain
a deterministic presentation $\mathcal{H}_{3;t}$ for the set $\mathcal{B}'_{3;t}$. Given this
presentation, we can compute the capacity $H'_3(t)$ and construct
encoders into $\mathcal{B}'_{3;t}$, as described in the previous subsection.
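
As an illustration of the determinization step, here is a minimal Python sketch of the subset construction on a small labeled graph. The graph `toy`, the dictionary representation (vertex mapped to a list of `(label, target)` pairs), and all function names are assumptions made for this example; the toy graph is not the paper's graph. The sketch generates only the subsets reachable from singletons, which suffices to present the same set of finite label sequences.

```python
from collections import deque

def is_deterministic(graph):
    """True if, at every vertex, outgoing edges carry pairwise distinct labels."""
    for edges in graph.values():
        labels = [label for label, _ in edges]
        if len(labels) != len(set(labels)):
            return False
    return True

def abs_labels(graph):
    """Replace every edge label by its absolute value."""
    return {v: [(abs(a), w) for a, w in edges] for v, edges in graph.items()}

def subset_construction(graph):
    """Determinize a labeled graph; states are nonempty sets of vertices."""
    result = {}
    queue = deque(frozenset([v]) for v in graph)
    while queue:
        state = queue.popleft()
        if state in result:
            continue
        # Group the targets of all edges leaving the subset by their label.
        targets = {}
        for v in state:
            for label, w in graph[v]:
                targets.setdefault(label, set()).add(w)
        result[state] = [(a, frozenset(ws)) for a, ws in sorted(targets.items())]
        for _, nxt in result[state]:
            if nxt not in result:
                queue.append(nxt)
    return result

# Hypothetical toy graph with labels in {-1, 0, +1} (illustration only).
toy = {
    "a": [(+1, "b"), (-1, "c"), (0, "a")],
    "b": [(0, "a")],
    "c": [(+1, "a")],
}
toy_abs = abs_labels(toy)               # two edges labeled 1 now leave "a"
toy_det = subset_construction(toy_abs)  # deterministic again
```

In this toy case the signed graph is deterministic, taking absolute values destroys that property at vertex `a`, and the subset construction restores it at the cost of one extra state.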

We can now summarize the entire procedure for computing
the capacity $H'_3(t)$, as follows:

1) Construct the graphs $\mathcal{G}_{3;t}$ and $\mathcal{G}'_{3;t}$ as described above,
and let $\mathcal{B}'_{3;t}$ be the constrained system presented by $\mathcal{G}'_{3;t}$.

2) Apply the subset construction method to $\mathcal{G}'_{3;t}$ in order to
obtain a deterministic presentation $\mathcal{H}_{3;t}$ for $\mathcal{B}'_{3;t}$.

3) Construct the adjacency matrix $A_{3;t}$ of $\mathcal{H}_{3;t}$, and compute
its largest eigenvalue $\lambda = \lambda(A_{3;t})$. Set $H'_3(t) = \log_2 \lambda$.
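
The last step of this procedure can be sketched in a few lines of Python. The example below is an assumption-laden illustration: it uses power iteration on $A + I$ (a standard numerical technique, not something prescribed by the paper), assumes the graph is irreducible, and uses the two-vertex golden-mean graph as a stand-in because the actual graph is not reproduced here. The function names are hypothetical.

```python
import math

def adjacency_matrix(graph):
    """Adjacency matrix of a labeled graph given as vertex -> [(label, target)]."""
    index = {v: i for i, v in enumerate(graph)}
    adj = [[0] * len(graph) for _ in graph]
    for v, edges in graph.items():
        for _, w in edges:
            adj[index[v]][index[w]] += 1
    return adj

def perron_eigenvalue(adj, iterations=2000):
    """Estimate the largest eigenvalue of a nonnegative square matrix.

    Power iteration is run on A + I instead of A: the shift adds exactly 1
    to the largest eigenvalue and avoids oscillation for periodic graphs.
    Assumes the underlying graph is irreducible (strongly connected).
    """
    n = len(adj)
    vec = [1.0] * n
    growth = 1.0
    for _ in range(iterations):
        nxt = [vec[i] + sum(adj[i][j] * vec[j] for j in range(n)) for i in range(n)]
        growth = max(nxt)
        vec = [value / growth for value in nxt]
    return growth - 1.0

def capacity(adj):
    """Base-2 logarithm of the largest eigenvalue of the adjacency matrix."""
    return math.log2(perron_eigenvalue(adj))

# Stand-in example: binary sequences with no two consecutive 1s.
golden_mean_graph = {"a": [(0, "a"), (1, "b")], "b": [(0, "a")]}
golden_mean_capacity = capacity(adjacency_matrix(golden_mean_graph))
```

For the stand-in graph the largest eigenvalue is the golden ratio, so the computed capacity is $\log_2((1+\sqrt{5})/2)$. For large graphs a sparse eigenvalue solver would be the more practical choice.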

Of course, in theory, $\mathcal{H}_{3;t}$ can also be used to construct
finite-state encoders for converting unconstrained binary sequences
to sequences in $\mathcal{B}'_{3;t} \subseteq \mathcal{B}_{3;t}$, as explained in Section IV-C.
In turn, the graphs $\mathcal{G}_{3;t}$ and $\mathcal{G}'_{3;t}$ provide a method for
transforming a binary sequence $y \in \mathcal{B}'_{3;t}$ into a ternary sequence $x$,
with $\xi(x) = y$, that satisfies the TGP($t$) constraint. For each
given $y \in \mathcal{B}'_{3;t}$, there is a path in $\mathcal{G}'_{3;t}$ whose label sequence is
$y$. We may then take $x$ to be the sequence of labels along the
same path in $\mathcal{G}_{3;t}$. The practicality of this method depends on
the existence of a systematic procedure for finding a path in
$\mathcal{G}'_{3;t}$ that generates $y$. Of course, it also depends on the order
of the graphs $\mathcal{H}_{3;t}$, $\mathcal{G}'_{3;t}$, and $\mathcal{G}_{3;t}$.
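
One simple way to find such a path is a forward search that records, for each position, the set of vertices reachable so far together with back-pointers. The Python sketch below illustrates this idea under stated assumptions: it is not a procedure given in the paper, the function names are hypothetical, and the toy graph (vertex mapped to a list of `(signed label, target)` pairs) stands in for the actual graph. Its running time is linear in the length of $y$ and in the number of edges.

```python
def lift_to_ternary(graph, y):
    """Return signed labels x with |x_i| = y_i along some path, or None."""
    # reach[i] maps each vertex reachable after i symbols to a back-pointer
    # (previous vertex, signed label); start vertices have no back-pointer.
    reach = [{v: None for v in graph}]
    for symbol in y:
        layer = {}
        for v in reach[-1]:
            for label, w in graph[v]:
                if abs(label) == symbol and w not in layer:
                    layer[w] = (v, label)
        if not layer:
            return None  # y is not generated by the unsigned graph
        reach.append(layer)
    # Trace one path backwards from any vertex in the final layer.
    v = next(iter(reach[-1]))
    x = []
    for i in range(len(y), 0, -1):
        v, label = reach[i][v]
        x.append(label)
    x.reverse()
    return x

def is_label_sequence(graph, x):
    """True if x is the label sequence of some path in the signed graph."""
    current = set(graph)
    for label in x:
        current = {w for v in current for a, w in graph[v] if a == label}
        if not current:
            return False
    return True

# Hypothetical toy graph with labels in {-1, 0, +1} (illustration only).
toy = {
    "a": [(+1, "b"), (-1, "c"), (0, "a")],
    "b": [(0, "a")],
    "c": [(+1, "a")],
}
y = [1, 1, 0, 1]
x = lift_to_ternary(toy, y)
```

The returned sequence agrees with `y` in absolute value and labels a genuine path in the signed graph, which is all the lifting step requires.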

We have already observed that the order of $\mathcal{G}_{3;t}$ and $\mathcal{G}'_{3;t}$ is
exponential in $t$. However, since we are interested primarily
in small values of $t$, such exponential growth could still be
tolerated. The main computational problem is with the subset
construction method at Step 2 above. The subset construction
technique, when applied to a graph with $n$ vertices, produces
a graph with $2^n - 1$ vertices. As a result, the graph $\mathcal{H}_{3;t}$
constructed in Step 2 has at least $2^{9^t}$ vertices. In fact, this
is likely to be a vast underestimate of the order of $\mathcal{H}_{3;t}$.
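
To give a sense of scale, the following Python snippet evaluates this lower bound for small $t$. It reads the bound in the text as $2^{9^t}$, which is an assumption about the intended formula; the function name is hypothetical.

```python
def subset_lower_bound(t):
    """Lower bound on the number of vertices, read here as 2**(9**t)."""
    return 2 ** (9 ** t)

for t in (1, 2, 3):
    digits = len(str(subset_lower_bound(t)))
    print(f"t = {t}: 2^(9^{t}) has {digits} decimal digits")
```

Under this reading the bound is 512 for $t = 1$, a 25-digit number for $t = 2$, and a 220-digit number for $t = 3$, which is consistent with the remark that the general procedure is impractical for $t \ge 3$.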

V. Summary 

We have defined and analyzed a number of "ghost-pulse" 
constraints that can be used to design coding schemes which 
mitigate the formation of ghost pulses in the optical fiber 
channel. We show that coding schemes based upon sequences 
that satisfy the binary ghost-pulse (BGP) constraint must 
necessarily have poor rates, since the capacity of this constraint 
is zero. Sequences satisfying a more relaxed constraint, which
we call the BGP($t$) constraint, are more suitable for use as
codes; however, the rate of such codes is still too low for
practical applications. A more promising approach is to use
the phase-modulation idea, which leads to ternary constraints.
Thus we study the ternary ghost-pulse (TGP) and TGP($t$)
constraints. We leave the analysis of the TGP constraint with
unbounded memory as an open problem, conjecturing that it
has zero capacity. But we do provide a detailed analysis of
the TGP(1) and TGP(2) constraints. Our analysis suggests
that coding schemes using TGP($t$)-constrained sequences
can achieve much higher rates than those using
BGP($t$)-constrained sequences. We are therefore led to believe that
TGP($t$) constraints yield reasonably efficient schemes for
mitigating the ghost-pulse problem. We also discuss the design of
encoders and decoders for coding schemes involving the BGP,
the BGP($t$), and the TGP($t$) constraints. While the procedures
we suggest for coding into the BGP($t$), TGP(1), and TGP(2)
constraints can be implemented in practice, the corresponding
design procedure for the general TGP($t$) constraint with $t \ge 3$
is too computationally intensive to be implementable in its
present form.